Abstract

We consider a dynamic version of sender-receiver games, where the sequence of
states follows an irreducible Markov chain observed by the sender. Under mild
assumptions, we provide a simple characterization of the limit set of equilibrium payoffs,
as players become very patient. Under these assumptions, the limit set depends on the
Markov chain only through its invariant measure. The (limit) equilibrium payoffs are
the feasible payoffs that satisfy an individual rationality condition for the receiver, and
an incentive compatibility condition for the sender.

1 Introduction

Since Crawford and Sobel (1982), sender-receiver games, or cheap-talk games, have become
a natural framework for studying issues of information transmission between a privately
informed 'expert' and an uninformed decision maker, where the two parties have non-aligned
interests.



When the decision maker acts only once, the extent to which information can be shared 
at equilibrium has been studied extensively, when 'talk' takes place prior to the decision 
stage. While Crawford and Sobel (1982), see also Green and Stokey (2007), have focused 
on the case where communication is limited to a single costless and non-verifiable message
from the sender to the receiver, more recent papers have shown that this restriction is not
innocuous, and have characterized the equilibrium outcomes for general cheap-talk games,
see Krishna and Morgan (2001), Aumann and Hart (2003). This work has been motivated
by numerous concrete situations. We refer to Krishna and Morgan (2008), Farrell and Rabin 
(1996), and Sobel (2009) for a discussion of these applications. 

The present work is motivated by the following observation. Whether the sender is a 
financial advisor who provides advice to a client, an expert who is consulted on a project, or 
a referee on a project/person, the situation often calls for a dynamic approach. Indeed, the 
financial advisor provides advice on a series of investments, and the expert and the referee 
may be consulted on successive, related projects. 

Golosov, Skreta, Tsyvinski and Wilson (2009) consider such a situation. They assume
that the sender repeatedly sends messages, the receiver repeatedly makes decisions, while 
the state of the world remains fixed throughout. Within the Crawford and Sobel framework 
(continuum of states/messages), they show that, for some specifications on the initial dis- 
tribution on states, (necessarily complex) equilibria exist, that achieve full revelation of the 
state of the world in finite time. 

We here deal with situations in which the state of the world may change through time. 
Specifically, we assume that the successive states form an irreducible Markov chain over some 
finite set. In every stage, the sender issues a message/recommendation, and the receiver 
makes a decision. States are only known to the sender, and payoffs only depend on the 
current state and on the receiver's decision, but not on the message sent by the sender. 

Since states are autocorrelated, any information disclosed in stage n provides valuable 
information in later stages as well, as in Golosov et al. (2009). Yet, since the Markov Chain 
is irreducible, this information becomes eventually valueless. 

Intuitively, the inter-temporal situation puts some restrictions on the players' behavior. 
As an illustration, the opinion of an expert who systematically provides laudative reports 
will eventually come to be discounted, if not ignored, since the decision maker is aware of 
the fact that the time-average report of the quality of people/projects should reflect the
invariant measure of the states of the world. On the other hand, an expert who genuinely
provides accurate information to promote efficiency, but sees that the decision maker only
acts in his own interests, may become wary and may stop providing valuable information to
the decision maker. As is well-known from repeated games, the sender may indeed provide 

^The case of verifiable messages has also been studied in detail, see Forges and Koessler (2008). 
powerful incentives by conditioning his future communication policy on the behavior of the
decision maker. Similar insights already appear in the literature on dynamic contracting, 
see Baron and Besanko (1984), Besanko (1985) or Battaglini (2005). 

Our paper relates to the recent and growing literature on incomplete information games, 
in which the uncertainty evolves, see, e.g., Athey and Bagwell (2008), Mailath and Samuelson 
(2001), Phelan (2006), Renault (2006), Wiseman (2008), and Horner, Rosenberg, Solan and 
Vieille (2010) and, especially, Escobar and Toikka (2010). 

We provide a characterization of the limit set of sequential equilibrium payoffs, when 
players are very patient. 

Our main findings are the following. We first show (Theorem 1) that a feasible payoff 
vector is a (limit) equilibrium payoff as soon as the following two conditions are met. On 
the one hand, the payoff of the receiver should be at least his babbling equilibrium payoff. 
This condition is an individual rationality condition. Indeed, the latter payoff is equal to 
the receiver minmax payoff in the dynamic game since the receiver has the option to ignore 
the announcements of the sender. On the other hand, the sender's payoff should satisfy
an incentive compatibility condition, which reflects the fact that the sender has the option
of substituting artificially generated states for the true ones when playing, as long as the
artificial states are statistically indistinguishable from the true ones. As it turns out, this
incentive constraint takes the form of finitely many linear inequalities.

In the corresponding equilibria, with high probability the sender truthfully reports the
current state most of the time, while the receiver responds in a stationary manner to the
announcements of the sender, and checks that the distribution of these announcements is
consistent with the invariant measure.

We next show (Theorem 2) that the converse inclusion holds under some additional
condition on the Markov chain, which we call Assumption A: any limit equilibrium payoff must
satisfy the individual rationality condition and the specific version of the incentive compatibility
requirement of Theorem 1.

A noteworthy consequence is that, under Assumption A, the limit set of equilibrium 
payoffs does not depend on how successive states are correlated, nor on fine details of the 
sequence of states, but only on the invariant measure. It is also irrelevant whether the sender 
learns some, or even all, of the realization of the future states in advance. In particular, the 
set of equilibrium payoffs can be computed as if successive states were independent. 

Our results are valid for a large (open) class of payoff functions for the static game, 

^While this is reminiscent of the revelation principle, we must stress that no revelation principle applies
in our setting.
but not for all of them. More precisely, we prove that for generic payoff functions (and
under Assumption A), either our results hold, or all equilibria of the repeated game are 
payoff-equivalent to babbling equilibria. 

The paper is organized as follows. The model is described in Section 2. In Section 3 we
explain most insights by means of an example. The main results appear in Section 4, together
with an illustration. Proofs are discussed in Section 5 and the Appendix. Additional results
and comments are provided in Section ??. The Appendix contains all proof details.

2 Model 

We study dynamic sender-receiver games, in which the state of the world changes through
time. At each stage $n \geq 1$, the sender (player 1) observes the current state of the world
$s_n \in S$, and makes an announcement $a_n \in A$. Upon observing $a_n$, the receiver (player
2) chooses an action $b_n \in B$. The current action $b_n$, together with the current state $s_n$,
determines the utility vector $u(s_n, b_n) \in \mathbb{R}^2$ at stage $n$. Only the action $b_n$ is then publicly
disclosed. We thus maintain the assumption that payoffs are not observed. The two players
share a common discount factor $\delta$.

We assume throughout that the set of states S, the set of messages A, and the set of 
actions B, are finite. We also assume that there are at least as many messages as states. This 
assumption ensures that the only motives for concealing the state are strategic. We thus leave 
aside situations in which, due to capacity constraints, the sender might be forced to choose 
which feature of the state to reveal. For simplicity, we will actually assume throughout that 
the set A of messages coincides with the set S of states. (As will be seen, this assumption is 
without loss of generality in our setup.) 

We assume that the states $(s_n)$ follow a Markov chain over $S$, with transition function
$p(\cdot \mid \cdot)$, which is irreducible and aperiodic. The Markov chain therefore admits a unique
invariant measure, $m \in \Delta(S)$. For convenience, we assume that the first state, $s_1$, is drawn
according to $m$. This ensures that the law of $s_n$ is equal to $m$, for every $n \geq 1$.
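As a purely illustrative sketch of the last two claims, the following code approximates the invariant measure of a small chain by iterating the transition function, and checks that a state drawn from that measure keeps the same law one stage later. The two-state matrix `P` and the function names are assumptions made for the example; they do not come from the model above.

```python
def step(dist, P):
    """Law of the next state: dist'[t] = sum_s dist[s] * P[s][t]."""
    n = len(P)
    return [sum(dist[s] * P[s][t] for s in range(n)) for t in range(n)]


def invariant_measure(P, iterations=200):
    """Approximate the invariant measure by iterating from the uniform law.

    Convergence of this iteration relies on the chain being irreducible
    and aperiodic, as assumed in the text.
    """
    n = len(P)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = step(dist, P)
    return dist


# Hypothetical two-state chain, with P[s][t] standing for p(t | s).
P = [[0.9, 0.1],
     [0.3, 0.7]]
m = invariant_measure(P)

# If the first state is drawn according to m, so is every later state.
next_law = step(m, P)
```

For this hypothetical chain the balance equation 0.1 m(0) = 0.3 m(1) gives m = (3/4, 1/4), which the iteration recovers up to numerical error.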

In this setup, a strategy of the sender maps past and current realized states, and past
play, into a mixed message, and is thus a map $\sigma : \cup_{n \geq 0} (S \times A \times B)^n \times S \to \Delta(A)$, while a
strategy of the receiver is a map $\tau : \cup_{n \geq 0} (A \times B)^n \times A \to \Delta(B)$. A stationary strategy of the

^That is, for any two states $s, t \in S$, and for every $N \in \mathbb{N}$ large enough, the probability of moving from
$s$ to $t$ in exactly $N$ stages is positive.
receiver is a map $y : A \to \Delta(B)$, with the interpretation that the receiver chooses his action
according to $y(\cdot \mid a) \in \Delta(B)$ whenever told $a \in A$.

Our goal is to study to what extent the dynamic structure of the game affects the equi- 
librium outcomes. Formally, we aim at providing a characterization of the limit set of 
sequential equilibrium payoffs, and at understanding equilibrium behavior, when players are 
very patient. 



3 An Example 

We here illustrate our main results by means of a simple example. There are two states,
$S = \{L, R\}$, and two actions for the receiver, $l$ and $r$. Successive states are independent and
equally likely. Payoffs are given by the two tables in Figure 1, where $c$ is a fixed parameter,
with $c \in (1, 2)$. The sender and the receiver are respectively players 1 and 2.

              State L                State R
            l        r             l         r
          c, 2     2, 1          1, -1     2, 1

Figure 1: The payoffs of the two players.
The one-shot information transmission game has a unique equilibrium, in which the 
receiver plays $r$ with probability 1. To see this, note that the sender strictly prefers action $r$
over action $l$, no matter what the state is. Thus, at equilibrium, all messages that are sent
with positive probability induce the same mixed action by the Receiver. This constant mixed 
action, being always ex post optimal for the Receiver, is therefore also ex ante optimal. It 
must thus assign probability one to action r. 

All equilibria in the one-shot game are therefore babbling equilibria. Plainly, the dynamic
game admits a babbling equilibrium, in which the sender repeatedly makes the same
announcement, the receiver treats the announcements as being non-informative, and plays
r in every stage. On the other hand, the receiver can always choose to ignore the announce- 
ments of the sender, and to play r in every stage, thereby getting 1. As a result, the babbling 
equilibrium is the worst equilibrium for the receiver, in both the one-shot and in the dynamic 
game. 

^In the sense that the action of the receiver is independent of the message sent by the sender.

We claim that the dynamic game has equilibrium payoffs that are arbitrarily close to
$\left(\frac{c+2}{2}, \frac{3}{2}\right)$. In particular, and in contrast with the receiver, there are equilibrium payoffs for
the sender that are below the babbling equilibrium payoff. Here is the intuition. The sender
announces the true state at every stage. The receiver listens to the announcements of the
sender, and plays $l$ when told $L$, and $r$ when told $R$. To prevent the sender from announcing
R in every stage, the receiver monitors the announcements of the sender, and stops listening 
if there is an obvious bias (towards either L or R). Under the constraint that he should 
announce both states equally often, the expected payoff of the sender is highest when he 
reports truthfully. 

While this intuition is simple, formalizing it into an equilibrium of the discounted game 
is not straightforward. Indeed, because payoffs are discounted, the sender may have a pref- 
erence to send at first the message R more frequently. 

We start with a simple construction that yields an equilibrium payoff distinct from $(2, 1)$.
Assume that the discount factor satisfies $\delta > \frac{4 - 2c}{3 - c}$, and consider the following strategy
profile.

• At odd stages, the sender announces truthfully the current state, and the receiver plays
$l$ if told $L$, and $r$ if told $R$.

• At even stages, the sender announces a constant message, and the receiver plays the 
action that he did not play in the previous stage. 

• If the receiver deviates, both players switch to the babbling equilibrium forever. 

respectively. Because a deviation of the receiver is followed by the babbling equilibrium, 
which yields 1 to the receiver, and because the (conditional) expected payoff of the receiver 
is at least 1 in every stage, no deviation of the receiver is profitable. Regarding the sender, 
it is sufficient to show that he cannot profit by deviating in any block of two stages. In such 
a block the sender has two possible deviations: to announce L in the first stage of the block 
when the true state is R, and to announce R in the first stage of the block when the true 
state is $L$. In the former case, he gets 1 at the first stage and 2 in the second (instead of 2
at the first stage and $\frac{1+c}{2}$ at the second stage if he announces truthfully). In the latter case,
he gets 2 at the first stage and $\frac{1+c}{2}$ at the second stage (instead of $c$ at the first stage and 2
at the second stage if he announces truthfully). The choice of $\delta$ ensures that none of these
deviations is profitable.
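The two deviation comparisons can be checked numerically. The sketch below is only an illustration under stated assumptions: it reads the second-stage payoff after the receiver played $r$ as the average of $c$ and 1 (the receiver must then play $l$ against an equally likely state), and the function names and the value $c = 1.5$ are hypothetical choices for the example.

```python
def sender_block_payoffs(c, delta):
    """Sender's discounted payoff over a two-stage block, by state and report.

    After the receiver plays r, he must play l next: the sender expects (1 + c) / 2.
    After the receiver plays l, he must play r next: the sender gets 2 for sure.
    """
    second_after_r = (1 + c) / 2
    second_after_l = 2.0
    return {
        ("L", "truth"): c + delta * second_after_l,
        ("L", "lie"): 2 + delta * second_after_r,
        ("R", "truth"): 2 + delta * second_after_r,
        ("R", "lie"): 1 + delta * second_after_l,
    }


def truthful_is_optimal(c, delta):
    """True when neither one-block misreport beats truthful reporting."""
    v = sender_block_payoffs(c, delta)
    return (v[("L", "truth")] >= v[("L", "lie")]
            and v[("R", "truth")] >= v[("R", "lie")])


def threshold(c):
    """Lower bound on the discount factor stated in the text."""
    return (4 - 2 * c) / (3 - c)


c = 1.5  # illustrative value in (1, 2)
above = truthful_is_optimal(c, 0.7)   # 0.7 exceeds threshold(1.5) = 2/3
below = truthful_is_optimal(c, 0.6)   # 0.6 is below the threshold
```

With these illustrative numbers the binding constraint is the one in state $L$: reporting $R$ gains $2 - c$ today and loses $\delta \left(2 - \frac{1+c}{2}\right)$ tomorrow, which rearranges to the threshold above.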
To get payoffs closer to $\left(\frac{c+2}{2}, \frac{3}{2}\right)$, we will be relying on a slightly more complex construction.
We let the size $2N$ of a block be large enough, so that a law of large numbers will
apply. Once $N$ is fixed, we let the discount factor $\delta$ be high enough, so that the contribution
of any individual block to the overall discounted payoff is very small.

We first describe a pure strategy $\tau$ of the receiver. In each block (unless the receiver
has deviated earlier), the receiver listens to the sender's announcements, plays $l$ if told $L$,
$r$ if told $R$, until the number of announcements of either $L$ or $R$ exceeds $N$. When this is
the case, the receiver stops listening to the sender's announcements, and repeats the least
frequent action until the end of the current block. In a sense, the sender is restricted to
announcing both states equally often in any given block of $2N$ stages. As such, the intuition
here is similar to some extent to the one behind the linking mechanism of Jackson and
Sonnenschein (2007) and, even more, to the analysis in Escobar and Toikka (2010).

If indeed the sender reports truthfully the current state, there is a high probability that
the receiver will be listening to the sender most of the time, and the expected payoff is
therefore close to $\left(\frac{c+2}{2}, \frac{3}{2}\right)$.
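The law-of-large-numbers claim can be explored with a short simulation. This is a sketch under one reading of the block strategy: the receiver follows a report as long as the corresponding action has been played fewer than $N$ times in the block, and plays the other action otherwise. The helper names, the seed, and the block size are assumptions made for the illustration.

```python
import random


def receiver_block_actions(reports, N):
    """Actions of the receiver over one block of 2N reports.

    He follows a report while the matching action has been used fewer
    than N times, and otherwise plays the other action.
    """
    counts = {"l": 0, "r": 0}
    actions = []
    for report in reports:
        wanted = "l" if report == "L" else "r"
        other = "r" if wanted == "l" else "l"
        action = wanted if counts[wanted] < N else other
        counts[action] += 1
        actions.append(action)
    return actions


def simulate_truthful_block(N, rng):
    """Draw 2N independent, equally likely states and report them truthfully."""
    states = [rng.choice("LR") for _ in range(2 * N)]
    actions = receiver_block_actions(states, N)
    matches = sum(1 for s, a in zip(states, actions) if s.lower() == a)
    return states, actions, matches / (2 * N)


rng = random.Random(0)  # fixed seed, chosen arbitrarily
states, actions, match_fraction = simulate_truthful_block(2000, rng)
```

By construction each action is played exactly $N$ times in a block, and the number of mismatched stages equals the excess of the more frequent state over $N$, which is of order $\sqrt{N}$.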

In contrast with the situation examined above, it need not be optimal for the sender to
report truthfully when facing $\tau$. However, a crucial insight is that any best reply of the
sender to $\tau$ must be reporting truthfully most of the time, with high probability. To see
why, observe that any best reply achieves a payoff of at least, say, $\frac{c+2}{2} - \varepsilon$. But since the
receiver plays both actions $l$ and $r$ equally often on each block, this implies that with high
probability the action of the receiver matches the state, most of the time.

We let $\sigma$ be any pure best reply of the sender to $\tau$. On the equilibrium path, we let
players play according to $\sigma$ and $\tau$. By construction, the equilibrium property holds for the
sender. To deter the receiver from deviating, both players switch forever to the babbling
equilibrium once a deviation of the receiver is detected. Since blocks are short, the expected
continuation payoff of the receiver is close to $\frac{3}{2}$ following any history, while the receiver gets
a payoff of 1 (or close to 1) if he deviates.

^An alternative construction, that we adopt in the general case, is for the receiver to generate a specific 
sequence of fictitious announcements, and continue as if the sender's announcements were equal to the 
fictitious ones. 

^The present analysis and the one in Escobar and Toikka (2010) were developed independently. 
4 Main Results



We here state and discuss two results on the limit set of equilibrium payoffs. Loosely speaking,
according to Theorem 1, all payoff vectors that are individually rational for the receiver
and incentive compatible for the sender are (asymptotically) equilibrium payoffs. Theorem
2 proves the converse inclusion. Further results are provided in Section ??.

4.1 Theorem 1

We start with some notation. We denote by $\mathcal{M} \subset \Delta(S \times A)$ the set of copulas based on $m$;
that is, the set of distributions $\mu$ over $S \times A$ whose marginals on $S$ and on $A$ are both equal
to $m$. The set $\mathcal{M}$ is defined by a finite number of linear inequalities, hence it is a compact
convex polyhedron, so it has finitely many extreme points.

We denote by $\mu_0 \in \mathcal{M}$ the specific distribution defined as $\mu_0(s, s) = m(s)$ for each $s \in S$,
and $\mu_0(s, a) = 0$ if $s \neq a$. Under $\mu_0$, the messages and the states coincide a.s. Thus, the
distribution $\mu_0$ is the long-run average distribution of the sequence $(s_n, a_n)_n$ when the sender
reports truthfully the current state.

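Given a copula $\mu \in \mathcal{M}$, and a stationary strategy $y : A \to \Delta(B)$, we set

$$U(\mu, y) = \sum_{s \in S,\, a \in A} \mu(s, a)\, u(s, y(\cdot \mid a)).$$

This is the expected payoff vector when the sender's report is drawn according to $\mu(\cdot \mid s)$,
and the receiver plays $y$.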
We denote by

$$v^2 := \max_{b \in B} \sum_{s \in S} m(s)\, u^2(s, b)$$

the babbling equilibrium payoff for the receiver.
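To make these two objects concrete, here is a small sketch that evaluates them on the example of Section 3. It assumes the reading that $U(\mu, y)$ is the $\mu$-weighted average of the payoff vector when the receiver draws his action from $y(\cdot \mid a)$, and that the babbling payoff is the receiver's best payoff from a single action under the prior $m$. The dictionary layout, the function names, and the value $c = 1.5$ are illustrative assumptions.

```python
def expected_payoff(mu, y, u):
    """U(mu, y): average of u(s, b) when (s, a) ~ mu and b ~ y(. | a)."""
    total = [0.0, 0.0]
    for (s, a), weight in mu.items():
        for b, prob in y[a].items():
            total[0] += weight * prob * u[(s, b)][0]
            total[1] += weight * prob * u[(s, b)][1]
    return tuple(total)


def babbling_value(m, u, actions):
    """Receiver's best payoff from one fixed action under the prior m."""
    return max(sum(m[s] * u[(s, b)][1] for s in m) for b in actions)


c = 1.5  # illustrative value in (1, 2)
u = {("L", "l"): (c, 2), ("L", "r"): (2, 1),
     ("R", "l"): (1, -1), ("R", "r"): (2, 1)}
m = {"L": 0.5, "R": 0.5}
mu0 = {("L", "L"): 0.5, ("R", "R"): 0.5}   # messages coincide with states
y = {"L": {"l": 1.0}, "R": {"r": 1.0}}      # receiver follows the report

U = expected_payoff(mu0, y, u)
v2 = babbling_value(m, u, ["l", "r"])
```

With $c = 1.5$ this gives $U(\mu_0, y) = (1.75, 1.5)$ and a babbling payoff of 1 for the receiver, in line with the discussion of the example.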

Definition 1 We let $E(\mathcal{M})$ denote the set of payoff vectors $U(\mu_0, y)$, where $y : A \to \Delta(B)$,
that satisfy

C1. $U^1(\mu_0, y) \geq U^1(\mu, y)$ for every $\mu \in \mathcal{M}$.    (1)

C2. $U^2(\mu_0, y) \geq v^2$.

We define $\mathring{E}(\mathcal{M})$ as the set of payoff vectors $U(\mu_0, y) \in E(\mathcal{M})$ where the inequalities in
C1 and C2 are strict. That is, $\mathring{E}(\mathcal{M})$ is the set of vectors $U(\mu_0, y)$, $y : A \to \Delta(B)$, such
that



^Recall that the set A of messages is a copy of S. 
D1. $U^1(\mu_0, y) > U^1(\mu, y)$ for every $\mu \in \mathcal{M}$, $\mu \neq \mu_0$.
D2. $U^2(\mu_0, y) > v^2$.

Note that condition D1 holds as soon as the inequality $U^1(\mu_0, y) > U^1(\mu, y)$ is satisfied
for each of the finitely many extreme points $\mu \neq \mu_0$ of $\mathcal{M}$.

We denote by $SE_\delta$ the set of sequential equilibrium payoffs of the game with discount
factor $\delta$. Our first main result, Theorem 1, shows that all payoffs in $E(\mathcal{M})$ can be obtained
as equilibrium payoffs, provided that players are sufficiently patient.

Theorem 1 Suppose that there exists a public randomizing device, which outputs a (uniformly
distributed) number in $[0,1]$ in every stage, after the announcement of the sender. If
$\mathring{E}(\mathcal{M}) \neq \emptyset$ then

$$E(\mathcal{M}) \subseteq \liminf_{\delta \to 1} SE_\delta.$$

Theorem 1 means that for every $\gamma \in E(\mathcal{M})$ and for every $\varepsilon > 0$, there exists $\delta_0 < 1$ such
that, for every $\delta > \delta_0$, the $\delta$-discounted game has a sequential equilibrium payoff within $\varepsilon$ of
$\gamma$. The proof of Theorem 1 is provided in Section 5.1.

A few comments are in order.

The babbling payoff $v^2$ is equal to the minmax value of the receiver in the dynamic
game. Hence condition C2 in Definition 1 reads as an individual rationality condition.
Condition C1 is akin to an incentive compatibility condition: under the constraint that the
distribution of messages is equal to the distribution of states, truth-telling is optimal for the
sender. According to Theorem 1, any payoff vector $U(\mu_0, y)$ that is incentive compatible for
the sender, and individually rational for the receiver, is an equilibrium payoff for $\delta$ large.

Our construction will have the somewhat surprising feature that the sender reports truth- 
fully, at least most of the time and with high probability. A direct intuition can be provided, 
that is reminiscent of the revelation principle in mechanism design. Let an equilibrium $(\sigma, \tau)$
be given. Consider the strategy profile where the sender reports truthfully, and the receiver
first computes the message that the strategy $\sigma$ would have sent, and next plays what $\tau$ would
have played given this message. We argue loosely that this new profile (when supplemented
with threats) is an equilibrium. The key to the argument is twofold. On the one hand, 

^We will actually prove the stronger statement that $\delta_0$ can be chosen to be independent of $\gamma$:

$$\lim_{\delta \to 1} \sup_{\gamma \in E(\mathcal{M})} d(\gamma, SE_\delta) = 0.$$
the sender can check that the receiver does indeed play as prescribed, and does not use the
additional information provided by the knowledge of the true state. On the other hand, 
the threat of switching to the babbling equilibrium is effective because the knowledge of the 
state at a given stage becomes eventually valueless in predicting distant stages, because of 
the irreducibility property of the sequence of states. However, we should stress that no rev- 
elation principle applies in our setup, and our equilibrium construction relies on the threat 
that the sender will stop providing information following a deviation. 

Theorem 1 relies on two assumptions. The public randomizing device can easily be dispensed
with, provided one slightly extends the communication options offered to the players.
To be specific, assume that the players are allowed to exchange simultaneous 'messages', after 
the sender has reported a state. Under such an assumption, players can implement jointly 
controlled lotteries as in Aumann and Maschler (1995), which can substitute for the random- 
ization device. Details are standard and omitted. However, when instead communication is 
restricted to a single message sent by the sender, then the existence of a public randomizing 
device is not without loss of generality, see Section 6.2 for an example.

Theorem 1 also requires $\mathring{E}(\mathcal{M})$ to be non-empty. This is similar to the non-empty interior
type of conditions which appear in Folk Theorems. Yet, we must stress that our assumption
is somewhat stronger, since $\mathring{E}(\mathcal{M})$ need not be equal to the relative interior of $E(\mathcal{M})$, and
the condition that $\mathring{E}(\mathcal{M}) \neq \emptyset$ is not generically satisfied. We provide a robust example
where $\mathring{E}(\mathcal{M}) = \emptyset$ and elaborate further on this issue in Section 6.

It is not an easy task to rely on Definition 1 to check whether a given payoff vector $U(\mu_0, y)$
belongs to $E(\mathcal{M})$. Fortunately, it turns out that conditions C1 and D1 are equivalent to
much simpler conditions.

Lemma 1 Let $y : A \to \Delta(B)$ be given. Conditions C1 and D1 are respectively equivalent
to conditions C'1 and D'1 below.

C'1 $\sum_{s \in S} u^1(s, y(\cdot \mid s)) \geq \sum_{s \in S} u^1(s, y(\cdot \mid \phi(s)))$, for every permutation $\phi$ over $S$.

D'1 $\sum_{s \in S} u^1(s, y(\cdot \mid s)) > \sum_{s \in S} u^1(s, y(\cdot \mid \phi(s)))$, for every permutation $\phi$ over $S$ that is not the
identity mapping.

The proof of Lemma 1 is in the Appendix. Interestingly, conditions C'1 and D'1 do
not involve the invariant distribution $m$. The intuition is best explained in the case of two
states, $s_0$ and $s_1$. Assume that the sender is considering mis-reporting the state, under the
constraint that the distribution of reports matches the invariant distribution $m$ of the state.
The only way to do this is to report $s_1$ instead of $s_0$, as often as to report $s_0$ instead of $s_1$.
Whether such a deviation is profitable is equivalent to asking how the unweighted sum of
the payoffs obtained in $s_0$ when reporting $s_1$ and in $s_1$ when reporting $s_0$ compares to the
unweighted sum of the payoffs obtained in the two states when reporting truthfully.
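The permutation conditions lend themselves to a direct check by enumeration. The sketch below is a hedged illustration, not the proof technique of the Appendix: it compares the unweighted truthful sum with the sum obtained under each non-identity permutation of reports, using the sender's payoffs from the example of Section 3. All function names and the sample values of $c$ are assumptions for the illustration.

```python
from itertools import permutations


def sender_payoff(s, a, y, u1):
    """u^1(s, y(. | a)): sender's expected payoff in state s when a is reported."""
    return sum(prob * u1[(s, b)] for b, prob in y[a].items())


def permutation_gaps(states, y, u1):
    """Truthful sum minus permuted sum, for every non-identity permutation."""
    truthful = sum(sender_payoff(s, s, y, u1) for s in states)
    gaps = []
    for image in permutations(states):
        if list(image) == list(states):
            continue  # skip the identity mapping
        permuted = sum(sender_payoff(s, a, y, u1) for s, a in zip(states, image))
        gaps.append(truthful - permuted)
    return gaps


def satisfies_C1_prime(states, y, u1):
    """Weak version: no permutation of reports does strictly better."""
    return all(gap >= 0 for gap in permutation_gaps(states, y, u1))


def satisfies_D1_prime(states, y, u1):
    """Strict version: every non-identity permutation does strictly worse."""
    return all(gap > 0 for gap in permutation_gaps(states, y, u1))


def example_u1(c):
    """Sender's payoffs in the two-state example (first coordinates of Figure 1)."""
    return {("L", "l"): c, ("L", "r"): 2, ("R", "l"): 1, ("R", "r"): 2}


states = ["L", "R"]
y = {"L": {"l": 1.0}, "R": {"r": 1.0}}  # receiver follows the report
strict_ok = satisfies_D1_prime(states, y, example_u1(1.5))
weak_only = (satisfies_C1_prime(states, y, example_u1(1.0)),
             satisfies_D1_prime(states, y, example_u1(1.0)))
```

In the two-state example the only non-identity permutation swaps the reports, so the comparison is $c + 2$ against $2 + 1$: the strict condition holds exactly when $c > 1$. The enumeration has factorial cost in the number of states, so it is only meant for small examples.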

4.2 Theorem 2

Our second main result provides the converse inclusion to that in Theorem 1. It requires
one substantive assumption on the behavior of the state, Assumption A below.

Assumption 1 (Assumption A) There exist nonnegative numbers $\alpha_s$, $s \in S$, with
$\sum_{s' \in S \setminus \{s\}} \alpha_{s'} \leq 1$ (for every $s \in S$), such that $p(s' \mid s) = \alpha_{s'}$ whenever $s' \neq s$.

Assumption A is restrictive. Yet it does e.g. hold in the following cases. 

Assume first that changes in the state are due to shocks, which occur at random times. 
Once drawn, the state remains constant until a shock occurs. The state is then drawn 
anew, according to m. The inter-arrival times of the successive shocks are i.i.d., and follow 
a geometric distribution. In that case, Assumption A is met. Indeed, it suffices to set
$\alpha_t = \pi \times m(t)$ for every $t \in S$, where $\pi$ is the per-stage probability of a shock. The
parameter $\pi$ is here a measure of the state persistence. When $\pi$ increases from 0 to 1, the
situation evolves from one in which the state remains constant through time, to a situation
in which successive states are independent.

When $\pi = 1$, the successive states are independent, and identically distributed according
to m. Thus, Assumption A holds in the case of i.i.d. states. 

Assumption A also holds in the benchmark case where there are only two possible
states. Indeed, denoting the two states by $s_1$ and $s_2$, it suffices to set $\alpha_1 = p(s_1 \mid s_2)$ and
$\alpha_2 = p(s_2 \mid s_1)$. In particular, it is satisfied in the models in Athey and Bagwell (2008),
Phelan (2006) and Wiseman (2008).

As a further simple illustration, consider a symmetric random walk on three states. That
is, whenever in a state, the chain moves to each of the two other states with probability $\frac{1}{2}$.
Again, Assumption A is met, with $\alpha_s = \frac{1}{2}$ for each $s \in S$.
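Assumption A says that the probability of moving into a state $t$ from any other state does not depend on the state of origin. The following sketch tests this property on a transition matrix and tries it on the cases just listed. The function names, the tolerance, and the sample parameters are assumptions for the illustration; the walk on a cycle of five states is included only as a contrasting case in which the property fails.

```python
def assumption_A_alphas(P, tol=1e-12):
    """Return [alpha_t] if P[s][t] == alpha_t for all s != t, else None.

    P[s][t] stands for p(t | s). Requires at least two states.
    """
    n = len(P)
    alphas = []
    for t in range(n):
        column = [P[s][t] for s in range(n) if s != t]
        if max(column) - min(column) > tol:
            return None  # entry probability depends on the origin state
        alphas.append(column[0])
    return alphas


def shock_chain(m, pi):
    """Chain in which a shock (probability pi) redraws the state from m."""
    n = len(m)
    return [[(1 - pi) * (1.0 if s == t else 0.0) + pi * m[t] for t in range(n)]
            for s in range(n)]


# Symmetric random walk on three states.
walk3 = [[0.0, 0.5, 0.5],
         [0.5, 0.0, 0.5],
         [0.5, 0.5, 0.0]]

# Symmetric random walk on a cycle of five states (moves to s+1 or s-1 mod 5).
cycle5 = [[0.5 if (t - s) % 5 in (1, 4) else 0.0 for t in range(5)]
          for s in range(5)]

alphas_walk3 = assumption_A_alphas(walk3)
alphas_cycle5 = assumption_A_alphas(cycle5)
alphas_shock = assumption_A_alphas(shock_chain([0.2, 0.3, 0.5], 0.4))
```

The bound on the sum of the $\alpha_{s'}$ is not checked separately, since it follows from each row of a transition matrix summing to one with a nonnegative diagonal entry.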

We denote by $NE_\delta$ the set of (Nash) equilibrium payoffs in the game with discount factor
$\delta$.
Theorem 2 Suppose that Assumption A holds. Then, for every $\delta < 1$, one has

$$NE_\delta \subseteq E(\mathcal{M}).$$

Provided that $\mathring{E}(\mathcal{M}) \neq \emptyset$ and that Assumption A is met, Theorems 1 and 2 thus imply
that the set of sequential equilibrium payoffs $SE_\delta$ converges to the set $E(\mathcal{M})$ (as soon as a
randomizing device is available).

Note that the set $\mathcal{M}$ of copulas only depends on the invariant measure $m$, and not on
finer details of the transition function. A striking implication of the characterization is that, 
under Assumption A, the limit set of equilibrium payoffs therefore only depends on the 
invariant measure m. In particular, the limit set of equilibrium payoffs is the same as when 
the states are drawn independently across stages. That is, the amount of state persistence 
is irrelevant for the determination of the limit set of equilibrium payoffs. 

If the initial state $s_1$ were to remain fixed throughout the play, the game would fall into the
class of repeated games with incomplete information introduced by Aumann and Maschler
(1995). (This is the setup studied in Golosov et al. (2009).) In this case, the limit set of
discounted equilibrium payoffs, when $\delta$ goes to 1, is typically not equal to $E(\mathcal{M})$. Hence,
there is a discontinuity in the limit set of equilibrium payoffs when successive states become
perfectly autocorrelated.

By contrast, for a fixed discount factor, the set of equilibrium payoffs is upper hemi-continuous
with respect to the transition function. The source of this apparent paradox can
be traced back to the fact that, in loose terms, the convergence of the set $SE_\delta$ to $E(\mathcal{M})$ is
slower, the more correlated successive states are.

The main insight to be derived from Theorem 2 is the following. The incentive compatibility
condition C1 is a very strong one. Indeed, the only requirement it places on deviations is that the
distribution of announcements matches the invariant measure. In particular, according to
Theorem 2, all equilibria are payoff equivalent to equilibria in which the receiver only checks
that the announcement frequencies are consistent with $m$. Yet, much more sophisticated
checks would be available to the receiver. The receiver might e.g. check that the empirical
distribution of two-letter words $(s, s')$ matches the transition function as in Escobar and
Toikka (2010), or look at the distribution of three-letter words, etc. This might potentially
allow the receiver to impose weaker incentive constraints than the one in C1, and therefore
allow for equilibrium payoffs outside of $E(\mathcal{M})$. Theorem 2 thus identifies one class of Markov
chains for which this is not the case.

^Our Theorems 1 and 2 extend to cover the case of uniform equilibrium payoffs.
We provide below an example where the conclusion of Theorem 2 fails to hold if Assumption A
is not satisfied. Thus, in general, the limit set of sequential equilibrium payoffs
does not only depend on the invariant measure, but also on finer details of the transition 
function. 

Example 1 Consider a game with 5 states $S := \{0, 1, 2, 3, 4\}$. The sequence of states follows
a random walk on $S$. When in $s$, the chain moves either to $s + 1$ (mod 5) or to $s - 1$ (mod 5)
with equal probabilities. The action set $B$ of the receiver coincides with $S$, and the payoff
function is described in Figure 2, where $c > 1$.







            b = 0    b = 1    b = 2    b = 3    b = 4
   s = 0    1,1      c,0      0,0      0,0      0,0
   s = 1    c,0      1,1      0,0      0,0      0,0
   s = 2    0,0      0,0      1,1      0,0      0,0
   s = 3    0,0      0,0      0,0      1,1      0,0
   s = 4    0,0      0,0      0,0      0,0      1,1

Figure 2: The game in Example 1



Thus, both players receive a payoff 1 if the action matches the current state, and 0
otherwise, except when the receiver chooses action 1 in state 0, or action 0 in state 1.

The payoff vector $(1, 1)$ is not in $E(\mathcal{M})$ as soon as $c > 1$. Indeed, the stationary strategy
$y : A \to \Delta(B)$ defined by $y(s \mid s) = 1$ is the only strategy such that $U(\mu_0, y) = (1, 1)$. But
then, the sender profits by reporting $t = 1$ whenever $s = 0$, and $t = 0$ whenever $s = 1$.
On the other hand, (1,1) is an equilibrium payoff, as soon as c < |, provided the players 
are patient enough. Indeed, consider the strategy of the receiver in which he matches the
announcement of the sender, as long as $|a_{n+1} - a_n| = 1$ modulo 5, and switches forever to the
babbling equilibrium (e.g., playing always $b = 4$) if $|a_{n+1} - a_n| \neq 1$ modulo 5 for some stage
$n$. Provided $c$ is not too large, the best response of the sender is to report the true state. If
instead, say, the sender chooses to report $t = 1$ when in fact $s = 0$ in a given stage, he gains
$c - 1$, but then in the next period, with probability $\frac{1}{2}$ the new state will be $s = 4$, and then
he will either report $t \in \{0, 2\}$ and receive 0, or report $t \in \{1, 3, 4\}$ and be punished with
the babbling equilibrium payoff $\frac{1}{5}$ forever. Provided the players are patient enough, such a
deviation is not profitable.

In this equilibrium, the receiver checks that the one-step transitions between successive 
announcements are consistent with the transitions of the Markov chain. As it turns out, 
under Assumption A, such a sophisticated statistical analysis of the announcements is not 
more powerful than a statistical analysis which is based only on the empirical frequencies of
the different announcements. ♦ 

Theorems 1 and 2 hold as soon as the sender knows the current state. As will be clear
from the proof, they continue to hold if the sender knows more. In particular, they hold in 
the extreme case where the sender learns the entire sequence of realized states in stage 1, or 
in any intermediate setup. 

Note that we interpret the case $\delta \to 1$ as players being very patient. It is not possible here
to interpret it as a situation in which players would interact more and more frequently. Indeed,
a proper analysis of this latter case would take into account the impact on transitions: 
when players interact more frequently, states become more persistent between successive 
interactions. 

4.3 An illustration 

We here analyze a simple, specific example to show how to pin down the set of equilibrium
payoffs using our results. We let the set of states be $S = \{s_0, s_1, s_2\}$. Between any two
stages, the state changes with probability one, and each of the two possible states is equally
likely. Thus, $p(t \mid s) = \tfrac12$ for every $s \neq t \in S$. Note that Assumption A on the transition
function does hold, and that the invariant measure assigns probability $\tfrac13$ to each state.
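
As a quick numerical check of the claim about the invariant measure, the following Python sketch builds the transition matrix described above (the state always changes, and each of the two other states is equally likely) and verifies that the uniform distribution is unchanged by one step of the chain. The names `P`, `step` and `uniform` are illustrative and do not come from the paper; Assumption A itself is not checked here.

```python
from fractions import Fraction

# Transition matrix of the three-state chain: the state always changes,
# and each of the two other states is reached with probability 1/2.
half = Fraction(1, 2)
P = [
    [0, half, half],
    [half, 0, half],
    [half, half, 0],
]


def step(dist, P):
    """Return the distribution of the next state, given the current one."""
    n = len(P)
    return [sum(dist[s] * P[s][t] for s in range(n)) for t in range(n)]


# Candidate invariant measure: probability 1/3 on each state.
uniform = [Fraction(1, 3)] * 3
is_invariant = step(uniform, P) == uniform
```

Exact rational arithmetic is used so that the equality test is not affected by rounding.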

The receiver has three actions, denoted L, M and R, and the payoffs in the different 
states are given by the matrix 

\[
\begin{pmatrix}
1,1 & 0,0 & 0,0 \\
0,0 & 1,0 & 0,1 \\
0,0 & 0,1 & 1,0
\end{pmatrix}
\]

where each row corresponds to a state, and each column to an action. For instance, the first
row specifies the payoffs in state $s_0$, as a function of the action of the receiver.

All extreme points of the feasible set are obtained by having the sender report truthfully
the state, and the receiver then play a pure, state-dependent, action. Thus, all extreme
points are obtained by picking one entry in each row, and averaging. For instance, picking
L (resp. M, R) in row $s_0$ (resp. $s_1$, $s_2$) and averaging over states leads to a payoff of $(1, \tfrac13)$.
One checks that the feasible set is the convex hull of the five payoffs $(0,0)$, $(0,\tfrac23)$, $(\tfrac13,1)$,
$(\tfrac23,0)$ and $(1,\tfrac13)$, see Figure 3 below.
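
The "one checks" step can be reproduced mechanically. The sketch below, written under the assumption that the payoff matrix is read row by row as (sender, receiver) pairs, enumerates the 27 pure state-dependent action plans, averages the payoffs with weight 1/3 per state, and extracts the vertices of the convex hull with a standard monotone-chain routine. The helper names (`average_payoff`, `convex_hull`) are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Rows are states s0, s1, s2; columns are actions L, M, R.
# Each entry is (sender payoff, receiver payoff).
U = [
    [(1, 1), (0, 0), (0, 0)],
    [(0, 0), (1, 0), (0, 1)],
    [(0, 0), (0, 1), (1, 0)],
]


def average_payoff(actions):
    """Payoff when actions[s] is played in state s, each state weighted 1/3."""
    x = sum(Fraction(U[s][a][0], 3) for s, a in enumerate(actions))
    y = sum(Fraction(U[s][a][1], 3) for s, a in enumerate(actions))
    return (x, y)


def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns only the strict vertices of the hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def build_chain(seq):
        chain = []
        for p in seq:
            # Pop points that make a right turn or are collinear.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = build_chain(pts)
    upper = build_chain(reversed(pts))
    return lower[:-1] + upper[:-1]


payoffs = [average_payoff(a) for a in product(range(3), repeat=3)]
vertices = set(convex_hull(payoffs))
```

Collinear points such as $(\tfrac23,\tfrac23)$, which lies on the segment between $(\tfrac13,1)$ and $(1,\tfrac13)$, are deliberately dropped so that only extreme points remain.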

Figure 3: The feasible set and the equilibrium set (shaded triangle).




Without any information on the state, all three actions of the receiver yield $\tfrac13$, hence $v^2 = \tfrac13$.

Let $\gamma = U(\mu_0,y) \in E(\mathcal{M})$ be a (limit) equilibrium payoff, and denote by $\mu$ the copula
obtained when the sender exchanges the two states $s_1$ and $s_2$ when reporting. Thus,
$\mu(s_1,s_2) = \mu(s_2,s_1) = \mu(s_0,s_0) = \tfrac13$, and $\mu(s,t) = 0$ otherwise. Since the payoffs of the
two players are exchanged in the two states $s_1$ and $s_2$, one has $U^1(\mu,y) = U^2(\mu_0,y)$ and
$U^1(\mu_0,y) = U^2(\mu,y)$. The incentive condition $U^1(\mu_0,y) \ge U^1(\mu,y)$ thus yields $\gamma^1 \ge \gamma^2$.

Note finally that the sum of the players' payoffs cannot exceed 2 in state $s_0$, and 1 in
states $s_1$ and $s_2$. Thus, $\gamma^1 + \gamma^2 \le \tfrac13(2 + 1 + 1) = \tfrac43$.

Hence, any equilibrium payoff $(\gamma^1, \gamma^2)$ lies in the shaded triangle defined by the
inequalities $\gamma^2 \ge \tfrac13$, $\gamma^1 + \gamma^2 \le \tfrac43$, $\gamma^1 \ge \gamma^2$, see Figure 3.

On the other hand, each of the extreme points of this triangle is an equilibrium payoff.
Indeed, $(\tfrac13, \tfrac13)$ is the babbling equilibrium payoff, while $(\tfrac23, \tfrac23) = U(\mu_0,y_1)$ and $(1,\tfrac13) =
U(\mu_0,y_2)$, where $y_1 : A \to \Delta(B)$ plays L when told $s_0$, and randomizes between M and R
otherwise, while $y_2$ plays L, M and R in states $s_0$, $s_1$ and $s_2$ respectively.
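
The three corner payoffs can be recomputed directly. The sketch below assumes truthful reporting, weight 1/3 on each state, and (as an assumption not spelled out in the text) that $y_1$ randomizes with equal probabilities between M and R. The function names `payoff` and `in_triangle` are illustrative only.

```python
from fractions import Fraction

# Rows are states s0, s1, s2; columns are actions L, M, R;
# entries are (sender payoff, receiver payoff).
U = [
    [(1, 1), (0, 0), (0, 0)],
    [(0, 0), (1, 0), (0, 1)],
    [(0, 0), (0, 1), (1, 0)],
]
third = Fraction(1, 3)
half = Fraction(1, 2)


def payoff(y):
    """Expected payoff under truthful reports; y[s][a] is the probability
    of action a when the receiver is told state s."""
    sender = sum(third * y[s][a] * U[s][a][0] for s in range(3) for a in range(3))
    receiver = sum(third * y[s][a] * U[s][a][1] for s in range(3) for a in range(3))
    return (sender, receiver)


def in_triangle(g):
    """Check the three inequalities that define the candidate equilibrium set."""
    g1, g2 = g
    return g2 >= third and g1 + g2 <= Fraction(4, 3) and g1 >= g2


always_L = [[1, 0, 0], [1, 0, 0], [1, 0, 0]]          # one babbling strategy
y1 = [[1, 0, 0], [0, half, half], [0, half, half]]    # equal mixing is an assumption
y2 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

corners = [payoff(always_L), payoff(y1), payoff(y2)]
```

Each of the three computed payoffs satisfies the inequalities with at least two of them binding, which is what makes them the corners of the triangle.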

As a result, the set of equilibrium payoffs is equal to the shaded triangle in Figure 3. 



5 Proofs 

5.1 Proof of Theorem [1] 

We here provide most details of the proof of Theorem 1. Some technical details are in
the Appendix. By the non-emptiness assumption of Theorem 1, there is $y_0 : A \to \Delta(B)$ such that
$U^2(\mu_0,y_0) > v^2$ and $U^1(\mu_0,y_0) > U^1(\mu,y_0)$ for every $\mu \in \mathcal{M}$, $\mu \neq \mu_0$.

Let $\varepsilon > 0$ and $U(\mu_0,y) \in E(\mathcal{M})$ be arbitrary, and redefine $y := \varepsilon y_0 + (1 - \varepsilon) y$. It is
sufficient to prove that, for this perturbed $y$, the payoff $U(\mu_0,y)$ is arbitrarily close to some
sequential equilibrium payoff of the $\delta$-discounted game, provided $\delta$ is high enough.

5.1.1 The strategies 

Let some integer $N \in \mathbb{N}$, and a discount factor $\delta$ be given. We here define a strategy profile
$(\sigma_*, \tau_*)$.

According to $(\sigma_*, \tau_*)$, the play is divided into consecutive blocks of $N$ stages. At the
beginning of each block, players discard past information, and re-start playing an $N$-stage
profile $(\sigma_0, \tau_0)$, where $\tau_0$ is a pure strategy. In case the receiver deviates from the pure
strategy $\tau_0$, the players switch to babbling play forever.

We now construct $\sigma_0$ and $\tau_0$, starting with $\tau_0$. Consider any block of $N$ stages. According
to $\tau_0$, the receiver "listens" to the reported state $a_n$ in stage $n$ and plays $y(\cdot \mid a_n)$, as long as no
state has been reported too often. As soon as this fails to be the case, the receiver substitutes
for the actual report of the sender some fictitious report $\theta_n$, and plays according to $y(\cdot \mid \theta_n)$.

To be formal, we pick a distribution $m_N \in \Delta(S)$ which best approximates the invariant
measure $m$, among all distributions $\tilde m \in \Delta(S)$ such that $N\tilde m(s)$ is an integer for all $s$.

For $s \in S$ and $n \in \mathbb{N}$, we denote by $N_n(s) = |\{k \le n : a_k = s\}|$ the number of stages
where the sender reported state $s$, and we set

\[ q := \min\{1 \le n \le N : N_n(a_n) > N m_N(a_n)\} \]

($\min \emptyset = +\infty$). Intuitively, each state $s \in S$ is allotted a quota of announcements equal to
$N m_N(s)$. The stage $q$ is the first stage in which quotas are no longer met. From stage $q$
until the end of the block, the receiver substitutes fictitious reports for actual ones.
Formally, we let $(\theta_n)$ ($n = 1, \ldots, N$) be a sequence such that

F1. $\theta_n = a_n$ for $n < q$;

F2. For each $s \in S$, the equality $|\{n \le N : \theta_n = s\}| = N m_N(s)$ always holds;

F3. Conditional on $(a_1, \ldots, a_q)$, the variables $(\theta_q, \ldots, \theta_N)$ are deterministic.

We will refer to $\theta_n$ as the announcement at stage $n$. Condition F1 means that the
announcements coincide with the sender's actual announcements prior to stage $q$; condition
F2 ensures that the entire sequence of announcements always satisfies the quotas; condition
F3 ensures in particular that the fictitious announcements are commonly known between
the two players.
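
The quota mechanism can be illustrated with a short sketch. It computes the stage $q$ at which a quota is first exceeded and then completes the block so that F1 and F2 hold. The paper only requires the completion to be deterministic (F3); the particular fill rule used here (remaining quota handed out in sorted state order) is an assumption made for illustration, as is the function name `fictitious_reports`.

```python
from collections import Counter


def fictitious_reports(reports, quota):
    """Return (q, announcements) for one block of N = len(reports) stages.

    reports: the sender's reports a_1, ..., a_N.
    quota:   dict mapping each state s to its integer quota N * m_N(s);
             the quotas must sum to N.
    q is None when no quota is ever exceeded (the paper sets q = +infinity).
    """
    N = len(reports)
    if sum(quota.values()) != N:
        raise ValueError("quotas must sum to the block length")

    counts = Counter()
    q = None
    for n, a in enumerate(reports, start=1):
        counts[a] += 1
        if counts[a] > quota.get(a, 0):
            q = n  # first stage at which the reported state exceeds its quota
            break

    if q is None:
        return None, list(reports)

    # F1: keep the actual reports before stage q.
    kept = list(reports[: q - 1])
    used = Counter(kept)

    # F2/F3: fill stages q..N deterministically with the unused quota
    # (sorted order is an illustrative choice, not the paper's rule).
    fill = []
    for s in sorted(quota):
        fill.extend([s] * (quota[s] - used[s]))
    return q, kept + fill


q, announcements = fictitious_reports(["x", "x", "y", "x"], {"x": 2, "y": 2})
```

In this example the third report of `x` exceeds the quota at stage 4, so the last announcement is replaced and the block ends up with exactly two announcements of each state.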
Thanks to the public randomizing device, the strategy $\tau_0$ may be rewritten as a pure
strategy.$^{10}$



The strategy $\sigma_0$ is defined to be any pure best-reply strategy of the sender to $\tau_0$ in the
$N$-stage $\delta$-discounted game starting in stage 1. Note that the strategy $\sigma_0$ is also a best-reply
to $\tau_0$ on any of the consecutive blocks of stages, conditional on past play.$^{11}$ In particular,
$\sigma_*$ is a best-reply to $\tau_*$.

5.1.2 Equilibrium properties 

We here argue that, for appropriate choices of $N$ and of $\delta$, $\tau_*$ is a best-reply to $\sigma_*$, and
$(\sigma_*, \tau_*)$ induces a payoff close to $U(\mu_0, y)$.

Proposition 1 For every $\eta > 0$, there exists $N_0 \in \mathbb{N}$ such that the following holds. For
every $N \ge N_0$, there is $\delta_0 < 1$ such that, for every $\delta \ge \delta_0$, the profile $(\sigma_*,\tau_*)$ is a sequential
equilibrium and induces a payoff within $\eta$ of $U(\mu_0, y)$.

The complete proof of Proposition 1 is in the Appendix. The crucial step consists in
showing that the fact that $\sigma_0$ is a best-reply to $\tau_0$ implies that, with high probability, $\sigma_0$
reports the true state in most stages. This is the content of Lemma 2 below. In the statement
of the lemma, $\mu_{\sigma_0,\tau_0}$ is the (expected, undiscounted) joint distribution of states and reports
in a block of $N$ stages. That is, for each $(s, a) \in S \times S$,

\[ \mu_{\sigma_0,\tau_0}(s,a) := \mathbf{E}_{\sigma_0,\tau_0}\Big[\tfrac{1}{N}\,\big|\{n \le N : (s_n, \theta_n) = (s,a)\}\big|\Big] \]



is the expected frequency of the pair $(s, a)$ over $N$ stages (recall that the distribution of the
initial state, $s_1$, is the invariant measure).

Lemma 2 For every $\eta > 0$, there is $N_0 \in \mathbb{N}$, such that the following holds. For every
$N \ge N_0$, there is $\delta_0 < 1$, such that, for every $\delta \ge \delta_0$, one has

\[ \|\mu_{\sigma_0,\tau_0} - \mu_0\| < \eta. \]

$^{10}$Indeed, denote by $X_n \sim \mathcal{U}([0,1])$ the output of the public device in stage $n$, and label the receiver's
actions from 1 to $|B|$. We let the strategy $\tau_0$ instruct the receiver to choose the action $b \in B$ whenever
$\sum_{i=1}^{b-1} y(i \mid \theta_n) \le X_n < \sum_{i=1}^{b} y(i \mid \theta_n)$. In effect, the device is performing publicly the desired randomization.

$^{11}$This observation relies on the fact that, in the first block, $\sigma_0$ is a best-reply to $\tau_0$, no matter what the
distribution of the initial state $s_1$ is.

$^{12}$Off-equilibrium path beliefs and sequential rationality issues are discussed in the Appendix.
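
The public randomization described in the footnote above is ordinary inverse-CDF sampling: a uniform draw is mapped to the action whose cumulative-probability interval contains it. A minimal sketch follows; the function name `action_from_public_draw` and the 1-based action labels are illustrative choices.

```python
def action_from_public_draw(x, probs):
    """Map a public uniform draw x in [0, 1) to an action label.

    probs[i - 1] is the probability that the mixed action assigns to
    action i. The returned label b satisfies
    sum(probs[:b - 1]) <= x < sum(probs[:b]).
    """
    cumulative = 0.0
    for b, p in enumerate(probs, start=1):
        cumulative += p
        if x < cumulative:
            return b
    # Guard against floating-point rounding when x is very close to 1.
    return len(probs)
```

Because the draw is public, the sender can reconstruct which action the mixed strategy prescribed, so any departure from it is detectable.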







We will provide insights into the proof of Lemma 2 below. For the time being, we show
how to deduce Proposition 1 from Lemma 2. Observe first that, by definition of $\mu_{\sigma_0,\tau_0}$, the
expected average$^{13}$ payoff induced by $(\sigma_0, \tau_0)$ over a single block is equal to $U(\mu_{\sigma_0,\tau_0}, y)$. For
fixed $N$, and since the profile $(\sigma_*, \tau_*)$ consists in periodic repetitions of $(\sigma_0, \tau_0)$, the discounted
payoff induced by $(\sigma_*,\tau_*)$ therefore converges to $U(\mu_{\sigma_0,\tau_0}, y)$ as $\delta \to 1$. In particular, it is
thus arbitrarily close to the target payoff $U(\mu_0,y)$.

We now argue that $\tau_*$ is a best-reply to $\sigma_*$. Following any history, the continuation payoff
of the receiver is equal to the sum of his payoffs until the end of the current block and of
the continuation payoff from the next block on. The latter is equal to the discounted payoff
induced by $(\sigma_*,\tau_*)$, computed using the belief held by the receiver at the beginning of this
block. For fixed $N$, this continuation payoff thus converges to $U^2(\mu_{\sigma_0,\tau_0}, y)$ as $\delta \to 1$.$^{14}$

On the other hand, any deviation from tq, say in stage n, triggers a babbling play, and 



where $p_k$ is the belief that the receiver will hold at stage $k > n$ on the current state $s_k$. Since
the sequence of states forms an irreducible and aperiodic chain, $p_k$ converges to $m$. For fixed
$N$, this continuation payoff therefore converges to $v^2$ as $\delta \to 1$ (again, uniformly over all
histories).

Since $U^2(\mu_{\sigma_0,\tau_0}, y) > v^2$, this proves the best-reply property of $\tau_0$, provided first $N$, and
then $\delta$, are chosen large enough.

We now turn to Lemma 2. We denote by $\sigma_{truth}$ the strategy of the sender that announces
truthfully the current state, no matter what. The proof of Lemma 2 combines several ideas.

First, by a law of large numbers for Markov chains, and if $N$ is large enough, there is
a high probability that the realized state frequencies will be consistent with the quotas in
most stages. Thus, under $(\sigma_{truth}, \tau_0)$, there is a high probability that the receiver follows the
announcement of the sender in most stages. That is, the distribution $\mu_{\sigma_{truth},\tau_0}$ is arbitrarily
close to $\mu_0$.

Next, for fixed $N$, and for every (periodic) strategy $\sigma$ (and viewing $\tau_0$ as a periodic
strategy), the discounted payoff $\gamma_\delta(\sigma, \tau_0)$ converges to $U(\mu_{\sigma,\tau_0},y)$ as $\delta$ converges to one.
Finally, the best-reply property of $\sigma_0$ implies that $\gamma^1_\delta(\sigma_0,\tau_0) \ge \gamma^1_\delta(\sigma_{truth},\tau_0)$.

$^{13}$That is, when payoffs in the different stages are not discounted.
$^{14}$Uniformly over all histories.
Combining these observations, the following formal statement holds. For every $\varepsilon > 0$,
there exist $N_0$ and $\delta_0$ such that, for every $\delta \ge \delta_0$, the following sequence of inequalities holds:

\[ U^1(\mu_{\sigma_0,\tau_0},y) \ge \gamma^1_\delta(\sigma_0,\tau_0) - \varepsilon \ge \gamma^1_\delta(\sigma_{truth},\tau_0) - \varepsilon \ge U^1(\mu_{\sigma_{truth},\tau_0},y) - 2\varepsilon \ge U^1(\mu_0, y) - 3\varepsilon. \tag{2} \]

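To conclude, we will rely on Lemma 3 below, which critically depends on the assumption
that $U(\mu_0, y_0) \in E(\mathcal{M})$. Denote by $\mathcal{M}_e$ the (finite) set of extreme points of $\mathcal{M}$. Recall that
$\mu_0 \in \mathcal{M}_e$. Set

\[ c_1 := \min_{\mu_e \in \mathcal{M}_e,\ \mu_e \neq \mu_0} \big(U^1(\mu_0, y_0) - U^1(\mu_e, y_0)\big), \quad \text{and} \quad c_2 := \max_{\mu_e \in \mathcal{M}_e} \|\mu_e - \mu_0\|_1, \]

and note that both $c_1$ and $c_2$ are positive.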
Lemma 3 For every ^ E Ai, one has 

Lemma 3 may be paraphrased as saying that any strategy that does approximately as well
as the truth-telling strategy must be telling the truth in most stages, with high probability.
The conclusion of Lemma 2 follows from (2) combined with Lemma 3.

5.2 Proof of Theorem 2

We here provide insights into the proof of Theorem 2. We let $\delta < 1$, and we fix a Nash
equilibrium $(\sigma, \tau)$ of the $\delta$-discounted game (with or without randomizing device). For
clarity, we sometimes use boldfaced letters to denote random variables.

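For $s \in S$, we define $y(\cdot \mid s) \in \Delta(B)$ as the expected discounted distribution of moves of
the receiver in state $s$. Formally, for $s \in S$, and $b \in B$, we set

\[ y(b \mid s) := \frac{1}{m(s)}\, \mathbf{E}_{\sigma,\tau}\Big[\sum_{n=1}^{\infty} (1-\delta)\delta^{n-1} \mathbf{1}_{\{\mathbf{s}_n = s,\ \mathbf{b}_n = b\}}\Big]. \]

By construction, one has $\gamma_\delta(\sigma, \tau) = U(\mu_0,y)$. Indeed,

\begin{align*}
\gamma_\delta(\sigma,\tau) &= (1-\delta)\sum_{n=1}^{\infty} \delta^{n-1} \mathbf{E}_{\sigma,\tau}[u(\mathbf{s}_n,\mathbf{b}_n)] \\
&= (1-\delta)\sum_{n=1}^{\infty} \delta^{n-1} \sum_{s \in S, b \in B} \mathbf{E}_{\sigma,\tau}\big[\mathbf{1}_{\{\mathbf{s}_n = s,\ \mathbf{b}_n = b\}}\big]\, u(s,b) \\
&= \sum_{s \in S, b \in B} u(s,b)\, m(s)\, y(b \mid s) = U(\mu_0,y),
\end{align*}

as desired.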

Since the distribution of $\mathbf{s}_n$ is equal to $m$ for each stage $n \in \mathbb{N}$, one has $\gamma^2_\delta(\sigma, \tau) \ge v^2$,
and thus, $U^2(\mu_0,y) \ge v^2$, so that C2 holds.

We thus need to prove that $U^1(\mu_0, y) \ge U^1(\mu, y)$ for each $\mu \in \mathcal{M}$. The idea of the proof
is rather straightforward, but the formal proof is fraught with many technical complications.
Let $\mu \in \mathcal{M}$ be given. We will construct a strategy $\sigma'$ of the sender such that $\gamma_\delta(\sigma', \tau) =
U(\mu, y)$, so that the desired inequality will follow from the equilibrium property of $(\sigma, \tau)$.

The strategy $\sigma'$ is designed as follows. Along the play, the sender will generate a sequence
$(\mathbf{t}_n)$ of fictitious states that is statistically indistinguishable from the sequence $(\mathbf{s}_n)$, and such
that the average distribution of the pair $(\mathbf{s}_n,\mathbf{t}_n)$ is given by $\mu$. Given such a sequence, in
any stage $n$, the sender will substitute the fictitious state $\mathbf{t}_n$ for the realized state $\mathbf{s}_n$ in
playing $\sigma$. Formally, following any history $(s_1, t_1, a_1, b_1, \ldots, s_n, t_n)$ consisting of realized and
fictitious states, messages and actions up to stage $n$, the strategy $\sigma'$ plays the mixed move
$\sigma(t_1, a_1, b_1, \ldots, t_n)$ that would have been played by $\sigma$, had the realized states been $t_1, \ldots, t_n$.

We now give some more details. Since the strategy r may feature complex statistical 
tests on the successive announcements, the notion of being statistically indistinguishable has 
to be interpreted in a restrictive sense. 

We prove in the Appendix the following lemma. 

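Lemma 4 Assume Assumption A, and let $\mu \in \mathcal{M}$ be given. There exists an $S$-valued
process$^{15}$ $(\mathbf{t}_n)_n$, such that:

P1 Conditional on $\mathbf{s}_n$, the vector $(\mathbf{t}_1, \ldots, \mathbf{t}_n)$ is independent of the future states $(\mathbf{s}_{n+1}, \mathbf{s}_{n+2}, \ldots)$.

P2 The law of the sequence $(\mathbf{t}_n)_n$ is the same as the law of the sequence $(\mathbf{s}_n)_n$.

P3 The law of the pair $(\mathbf{s}_n,\mathbf{t}_n)$ is $\mu$, for each stage $n \in \mathbb{N}$.

P4 The conditional law of $\mathbf{s}_n$, given $\mathbf{t}_1, \ldots, \mathbf{t}_n$, is $\mu(\cdot \mid \mathbf{t}_n)$.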

According to P1, the state $\mathbf{t}_n$ can be computed/simulated using only the information
available at stage $n$: past and current states, and past fictitious states. This is a feasibility
requirement that ensures that $\sigma'$ is well-defined. Condition P2 ensures that no statistical test
can discriminate between the sequences $(\mathbf{s}_n)$ and $(\mathbf{t}_n)$. Condition P3 provides the desired
coupling between $\mathbf{s}_n$ and $\mathbf{t}_n$.

$^{15}$The process $(\mathbf{t}_n)_n$ is possibly defined on a probability space which is an enlargement of the one on which
$(\mathbf{s}_n)_n$ is defined.
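We now proceed to show that the expected payoff induced by $(\sigma', \tau)$ is then equal to
$U(\mu,y)$, as claimed.

Below we will denote by $s_k, t_k, b_k$ generic values of the random variables $\mathbf{s}_k, \mathbf{t}_k$ and $\mathbf{b}_k$,
respectively. For any given stage $n \in \mathbb{N}$, the following sequence of equalities holds:

\begin{align*}
\mathbf{E}_{\sigma',\tau}[u(\mathbf{s}_n, \mathbf{b}_n)] &= \sum_{s_n, b_n} \mathbf{P}_{\sigma',\tau}(\mathbf{s}_n = s_n, \mathbf{b}_n = b_n)\, u(s_n, b_n) \\
&= \sum_{s_n} \sum_{t_1,\ldots,t_n} \sum_{b_1,\ldots,b_n} \mathbf{P}_{\sigma',\tau}(s_n, t_1, \ldots, t_n, b_1, \ldots, b_n)\, u(s_n,b_n) \\
&= \sum_{s_n} \sum_{t_1,\ldots,t_n} \sum_{b_1,\ldots,b_n} \mathbf{P}_{\sigma',\tau}(s_n \mid t_1,\ldots,t_n,b_1,\ldots,b_n)\, \mathbf{P}_{\sigma',\tau}(t_1,\ldots,t_n,b_1,\ldots,b_n)\, u(s_n,b_n) \\
&= \sum_{s_n} \sum_{t_1,\ldots,t_n} \sum_{b_1,\ldots,b_n} \mathbf{P}(s_n \mid t_1,\ldots,t_n)\, \mathbf{P}_{\sigma',\tau}(t_1,\ldots,t_n,b_1,\ldots,b_n)\, u(s_n,b_n) \\
&= \sum_{s_n} \sum_{t_1,\ldots,t_n} \sum_{b_1,\ldots,b_n} \mu(s_n \mid t_n)\, \mathbf{P}_{\sigma',\tau}(t_1,\ldots,t_n,b_1,\ldots,b_n)\, u(s_n,b_n) \\
&= \sum_{s_n, t_n, b_n} \mu(s_n \mid t_n)\, \mathbf{P}_{\sigma',\tau}(t_n,b_n)\, u(s_n,b_n), \tag{7}
\end{align*}

where the fourth equality holds because the variables $(\mathbf{b}_1, \ldots, \mathbf{b}_n)$ are conditionally independent of $\mathbf{s}_n$ given
$(\mathbf{t}_1, \ldots, \mathbf{t}_n)$, and the fifth equality holds by P4.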

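Using P2, and by the definition of $\sigma'$, the $\delta$-discounted sum of $\mathbf{P}_{\sigma',\tau}(\mathbf{t}_n = t_n, \mathbf{b}_n = b_n)$ is
equal to that of $\mathbf{P}_{\sigma,\tau}(\mathbf{s}_n = t_n, \mathbf{b}_n = b_n)$, which is equal to $\mu(t_n) \times y(b_n \mid t_n)$. By (7) we now obtain

\[ \gamma_\delta(\sigma', \tau) = \sum_{s,t,b} \mu(s \mid t)\,\mu(t)\,y(b \mid t)\,u(s,b) = U(\mu,y). \]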

6 Further results and comments 
6.1 On the condition E{M) 7^ 

In the light of existing results for repeated games, it is not surprising that some non-empty 
interiority type of assumption is needed (see Mailath and Samuelson (2006) for a survey). 
As the next example illustrates, the conclusion of Theorem [1] fails to hold if E(jM.) = 0. 

Example 2 Let there be two states and two actions for the receiver. The payoffs in the 
two states are given by the tables in Figure 4- We assume that the successive states are 
independent and that the two states are equally likely. 

        l        r                   l        r
      0.5,1     1,1                 0,0      1,1

         State L                      State R

Figure 4: The game in Example 2

The strategy which plays $r$ irrespective of the announcement is weakly dominant in
the one-shot game, and thus, $v^2 = 1$. Consider now the stationary strategy $y$ defined by
$y(l \mid L) = y(r \mid R) = 1$. The payoff vector $U(\mu_0,y) = (\tfrac34, 1)$ is in $E(\mathcal{M})$. However, we claim
that $(1,1)$ is the unique equilibrium payoff, irrespective of $\delta$. Here is why. Consider any
equilibrium $(\sigma, \tau)$. Plainly, the equilibrium payoff of the receiver is equal to 1. In particular,
with probability 1 the receiver plays $r$ whenever the current state is $R$. This implies that
in every stage, and for a.e. past history, there is one (possibly history-dependent) message
following which the receiver plays $r$, and which is assigned positive probability by $\sigma$. But
then, the sender gets a payoff 1 by assigning probability 1 to this specific message in every
stage. ♦

As we stressed, the statement of Theorem [1] is unsatisfactory in one important respect: 
while non-empty interior requirements in existing Folk Theorems are generically satisfied, 
the condition E{M.) ^ does not hold generically, as the next example shows. 

Example 3 Consider the game depicted in Figure 5, where there are two states, and the 
receiver has two actions. 

        l        r                   l        r
       1,1      0,0                 1,1      0,0

         State L                      State R

Figure 5: The game in Example 3

Here, $v^2 = 1$, and the stationary strategy $y_*$ which plays $l$ irrespective of the
announcement is the only stationary strategy that satisfies C2. Hence, $E(\mathcal{M})$ contains a single payoff
vector, $(1,1)$, and the non-emptiness condition of Theorem 1 fails. When payoffs are slightly perturbed, the strategy $y_*$
remains the only strategy satisfying C2, therefore the condition still fails for any such perturbation. ♦

Example E] suggests that if all strategies y for which U{nQ,y) is in E{M.) are constant 
strategies, then the set E{M.) is empty, even when payoffs in the game are slightly perturbed. 
We build on this intuition, and introduce a new condition. 
Condition B. There is a non-constant map $y : S \to \Delta(B)$ such that $U(\mu_0,y) \in E(\mathcal{M})$.

If condition B is not met, then all equilibrium payoffs are babbling.
In Theorem 3 below, we fix the transition function $p$ of the Markov chain, and identify a
game with a point in the space $\mathbb{R}^{2 \times S \times B}$ of payoff functions.

Theorem 3 Let a game G he given. 

If condition B holds for G, then any neighborhood ofG contains a game G' with Eqi^M.) ^ 

0. 

If condition B does not hold for $G$, there is a neighborhood of $G$ such that, for every
game in this neighborhood, condition B does not hold.

Theorem 3 allows us to complete the picture provided by Theorems 1 and 2, provided the
underlying Markov chain satisfies Assumption A. Indeed, let $G$ be a game. If Condition
B holds for the game $G$, Theorems 1 and 2 provide a characterization of the limit set of
equilibrium payoffs for games arbitrarily close to $G$. If condition B does not hold for the
game $G$, then all games close enough to $G$ have only babbling equilibrium payoffs.

6.2 On the role of the randomizing device 

The randomizing device is not needed in the proof of Theorem 2 to implement payoffs
$U(\mu_0,y)$, whenever $y(\cdot \mid s)$ is a pure strategy: it assigns probability 1 to some action $b(s)$,
for each $s \in S$. However, as soon as $y(\cdot \mid s)$ is a truly mixed distribution for some state $s$, it
may be impossible to dispense with the randomizing device, as we now argue by means of
an example.

Let there be two states, L and R. The successive states are drawn independently in every 
period, and each of the two states is equally likely. The receiver has three actions, denoted 
$B = \{l, m, r\}$. The payoffs are given in Figure 6.

        l       m       r                l        m        r
       3,0     0,4     2,1              1,-5     4,-4     2,1

            State L                          State R

Figure 6: The payoffs of the players.
Plainly, $v^2 = 1$. Define $y_*$ to be the stationary strategy such that $y_*(\cdot \mid R)$ assigns
probability 1 to $r$, and $y_*(\cdot \mid L)$ assigns probabilities $\tfrac23$ and $\tfrac13$ to $l$ and $m$, respectively. Then
$U(\mu_0,y_*) = (2, \tfrac76)$, and one can verify that $U(\mu_0,y_*) \in E(\mathcal{M})$ and that the non-emptiness
assumption of Theorem 1 holds. Thus, using Theorem 1, the vector $(2, \tfrac76)$ can be approximated by
sequential equilibrium payoffs, when
players are sufficiently patient, provided a randomizing device is available.
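
The numbers in this example can be recomputed from the payoff table of Figure 6. In the sketch below, the weights $\tfrac23$ on $l$ and $\tfrac13$ on $m$ in state L are treated as an assumption: they are the only mixture of $l$ and $m$ that gives the sender the payoff 2 stated in the text. The names `expected_payoff`, `v2` and `y_star` are illustrative.

```python
from fractions import Fraction

# payoffs[state][action] = (sender payoff, receiver payoff), read from Figure 6.
payoffs = {
    "L": {"l": (3, 0), "m": (0, 4), "r": (2, 1)},
    "R": {"l": (1, -5), "m": (4, -4), "r": (2, 1)},
}
# The two states are equally likely, as stated in the text.
prior = {"L": Fraction(1, 2), "R": Fraction(1, 2)}


def expected_payoff(y):
    """Expected payoff when the sender reports truthfully and the receiver
    plays the mixed action y[s] (a dict action -> probability) when told s."""
    sender = sum(prior[s] * p * payoffs[s][b][0] for s in prior for b, p in y[s].items())
    receiver = sum(prior[s] * p * payoffs[s][b][1] for s in prior for b, p in y[s].items())
    return (sender, receiver)


# Best payoff the receiver can secure without any information on the state.
v2 = max(sum(prior[s] * payoffs[s][b][1] for s in prior) for b in "lmr")

# Assumed mixture in state L (see the lead-in); r with probability 1 in state R.
y_star = {
    "L": {"l": Fraction(2, 3), "m": Fraction(1, 3)},
    "R": {"r": Fraction(1)},
}
target = expected_payoff(y_star)
```

Under these assumptions the receiver's payoff exceeds his uninformed value, which is the strict individual rationality that the argument relies on.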

We now assume that such a device is not available. Since successive states are independent,
the dynamic game can be viewed as an infinite repetition of the one-shot information
transmission game. With this interpretation, an action of the sender in the one-shot game
is a map $x : S \to A$, while an action of the receiver is a map $y : A \to B$. Given an action
profile $(x,y)$, payoffs are random, and take the value $u(s,y(x(s)))$ with probability $m(s)$, for
$s \in S$. Players then receive the public signal $(x(s),y(x(s)))$.

We will rely on Fudenberg, Levine and Maskin's (1994) characterization of the limit set 
of perfect public equilibrium (PPE) payoffs in repeated games with public signals. Some 
care is needed, as there are two dimensions according to which our repeated game does not 
fit into their setup. First, they assume that a player's payoff depends deterministically on 
his own action and on the public signal, while payoffs here depend randomly on the entire 
action profile {x,y). Second, their result is a characterization of public equilibrium payoffs, 
while we focus on sequential equilibrium payoffs. 

We briefly argue that their result nevertheless applies to our setting. On the one hand,
their result is still valid for games where payoffs depend on the entire action profile.$^{16}$ Next,
it can be verified that the auxiliary game in which stage payoffs are defined to be the expected
stage payoffs in our game (given the action profile) has the same set of PPE payoffs. Thus,
their result provides a characterization of the limit set of PPE payoffs for our game. On the
other hand, let $(\sigma, \tau)$ be a sequential equilibrium of our game, and define a public strategy
profile $(\bar\sigma, \bar\tau)$ as follows. Let any public history $h$ be given. At $h$, we let $\bar\sigma$ play the expectation
of the mixed move played by $\sigma$, where the expectation is computed w.r.t. the belief held
by the receiver at the information set which contains $h$. We define $\bar\tau(h)$ by exchanging the
roles of the two players. It can be verified that $(\bar\sigma, \bar\tau)$ is a public perfect equilibrium of the
repeated game.

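Fudenberg et al. (1994) showed that $\gamma \in \mathbb{R}^2$ is a limit PPE payoff if and only if for all
$\lambda \in \mathbb{R}^2$ we have $\lambda \cdot \gamma \le k(\lambda)$, where $k(\lambda)$ is the solution to a certain optimization problem
$\mathcal{P}(\lambda)$.$^{17}$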

We set $\gamma = (2, \tfrac76)$, and we will show that it is not a PPE payoff using the condition of
Fudenberg et al. (1994) with $\lambda_* = (0, 1)$. We now recall Fudenberg et al.'s (1994) definition
of $k(\lambda_*)$, and we will show that $\lambda_* \cdot \gamma > k(\lambda_*)$, implying that $\gamma$ is not a limit PPE payoff.

$^{16}$This can be seen from their proof or, alternatively, deduced from Horner et al. (2009).
$^{17}$Their result requires that a certain set have a non-empty interior, a condition that can be checked to be
met here.

We denote by $Z = A \times B$ the set of public signals in our game. The quantity $k(\lambda_*)$ is
defined as the value of the optimization problem $\mathcal{P}$:

\[ \sup V^2, \]

where the supremum is taken over all $(V^1, V^2) \in \mathbb{R}^2$, and all $\phi : Z \to \mathbb{R}^2$, such that

• $\phi^2(z) \le 0$ for every $z \in Z$;

• $(V^1, V^2)$ is a Nash equilibrium payoff of the one-shot game, with payoff function defined by

\[ \sum_{s \in S} m(s)\,\big[u(s,y(x(s))) + \phi(x(s),y(x(s)))\big] \tag{8} \]

for each action pair $(x,y)$.

Let $\phi : Z \to \mathbb{R}^2$ be any map such that $\phi^2(z) \le 0$ for each $z \in Z$, and let $(\alpha, \beta)$ be any
(possibly mixed) equilibrium of the one-shot game (8), with payoff $(V^1, V^2)$. We will prove
that $V^2 < \tfrac76$. We argue by contradiction, and assume that $V^2 \ge \tfrac76$. We distinguish between
two cases.

Assume first that $\alpha : S \to \Delta(A)$ is pooling: the distribution of messages is the same in
both states. Then, since $\phi^2(z) \le 0$, the expected payoff of the receiver is not higher than

\[ \max_{b \in B} \sum_{s \in S} m(s)\,u^2(s,b) = v^2 = 1. \]

Thus, $V^2 \le 1 < \tfrac76$, which is the desired contradiction.

Assume next that $\alpha$ is not pooling. Up to a relabelling of the messages, we may then
assume w.l.o.g. that the sender always tells the truth with positive probability. That is,
$\alpha(s \mid s) > 0$, for each $s \in S$. We denote by $\beta(\cdot \mid s)$ the conditional distribution of the
receiver's move under $(\alpha,\beta)$, conditional on the state being $s \in S$. Denoting by $s \neq t$ the
two states, the equilibrium property for the sender in the game (8) then implies that

\[ \sum_{b} \beta(b \mid s)\,\big(u^1(s, b) + \phi^1(s,b)\big) \ge \sum_{b} \beta(b \mid t)\,\big(u^1(s, b) + \phi^1(t,b)\big), \]

with equality if $\alpha(\cdot \mid s)$ assigns positive probability to both messages, and

\[ \sum_{b} \beta(b \mid t)\,\big(u^1(t, b) + \phi^1(t,b)\big) \ge \sum_{b} \beta(b \mid s)\,\big(u^1(t, b) + \phi^1(s, b)\big). \]

Using the two inequalities, one can verify that

\[ u^1(s, \beta(\cdot \mid s)) + u^1(t, \beta(\cdot \mid t)) \ge u^1(s, \beta(\cdot \mid t)) + u^1(t, \beta(\cdot \mid s)). \]

By Lemma [U condition C2 therefore holds for the stationary strategy /3. 

On the other hand, since $\phi^2(z) \le 0$ for each $z$, the expected payoff $V^2$ to the receiver does
not exceed $U^2(\mu_0, \beta)$. Hence, $U^2(\mu_0,\beta) \ge \tfrac76$. This readily implies that $U(\mu_0,\beta) \in E(\mathcal{M})$.

Next, one can verify that the highest payoff $U^2(\mu_0, \beta)$ to the receiver, over the whole set
$U(\mu_0,\beta) \in E(\mathcal{M})$, is equal to $\tfrac76$. In addition, the unique strategy $\beta$ that achieves such a
payoff is the strategy $y_*$. Since the supports of $y_*(\cdot \mid L)$ and $y_*(\cdot \mid R)$ are distinct, it must
therefore be that $\alpha$ is truth-telling: $\alpha(s \mid s) = 1$ for each $s$. Therefore, $\beta$ is equal to $y_*$.

Since $V^2 \ge \tfrac76$ and $V^2 \le U^2(\mu_0,y_*)$, one also has $V^2 = U^2(\mu_0, y_*)$. In particular, the
expectation of $\phi^2(z)$ under the equilibrium profile $(\alpha, \beta)$ must be equal to zero. Since $\phi^2(z) \le 0$
for each $z$, this implies that $\phi^2(z) = 0$, for each public signal $z$ that receives positive
probability under $(\alpha, \beta)$.

Using this, we finally claim that the equilibrium condition for the receiver in the game
(8) is violated. Indeed, when told $L$, the strategy $\beta = y_*$ assigns positive probability to both
$l$ and $m$. Hence, $\phi^2(L,l) = \phi^2(L,m) = 0$ by the previous paragraph. On the other hand
however, $u^2(L, m) > u^2(L, l)$, hence the receiver is not indifferent between both actions. This
is the desired contradiction.

6.3 Imperfect monitoring 

Let us assume here that successive states are independent. Results continue to hold if the 
receiver only observes a noisy, public version of the sender's message (provided the definition 
of y) is modified in an appropriate way). They still hold if the receiver observes a noisy, 
public signal of the current state, provided the individual rationality level v"^ is modified 
in the proper way. They also hold, without changes, if the sender only observes a noisy, 
public signal of the receiver's action. What happens in any of these variants when signals 
are private is beyond the scope of the paper. 

We briefly conclude this section by discussing the case where the sender fails to receive
any information relative to the receiver's choices. In spite of this feature, the game does
not reduce to a sequence of successive, independent, one-shot games, because of the ability
of the receiver to monitor the sender. In particular, it is easy to construct examples with
equilibrium payoffs that lie outside of the convex hull of the set of equilibrium payoffs in the
one-shot game.
We refer to the game where the sender does not observe the actions of the receiver as
the blind game. Denote by NE^ the set of all Nash equilibrium payoffs of the blind game. 
We prove that the value of monitoring is positive, in the sense that allowing the sender to
monitor the receiver has an unambiguous effect on the equilibrium set.

Proposition 2 The set NE\ is a subset of NE^. 

Proof. Let $(\sigma, \tau)$ be a Nash equilibrium of the blind game. Define $\tau'$ to be the following
strategy that depends only on the sender's announcements, and not on the receiver's past
actions: after a sequence $(a_1, \ldots, a_n)$ of announcements, $\tau'$ plays any action $b \in B$ with the
probability that the $n$-th action of the receiver according to $\tau$ is $b$, conditional on the sender's
announcements being $(a_1, \ldots, a_n)$:

\[ \tau'(a_1, \ldots, a_n)[b] = \mathbf{E}\big[\tau(a_1, b_1, a_2, b_2, \ldots, a_n)[b] \mid a_1, a_2, \ldots, a_n\big]. \]

In words, $\tau'$ gets rid of the possible correlation between successive actions of the receiver,
that may exist in the strategy $\tau$.

We claim that the strategy profile $(\sigma, \tau')$ is a Nash equilibrium of the blind game. Indeed,
$\tau'$ is a best-reply to $\sigma$ because it induces the same payoff as $\tau$. In turn, $\sigma$ is a best-reply to $\tau'$ because
any strategy of the sender in the blind game induces the same expected payoff against $\tau$ or
$\tau'$.

We next claim that the strategy profile $(\sigma, \tau')$ is a Nash equilibrium of the non-blind
game. Indeed, because under $\sigma$, the sender does not condition his play on past actions of the
receiver, and because $\tau'$ is a best response to $\sigma$ in the blind game, it follows that $\tau'$ is a best
response to $\sigma$ in the non-blind game as well. Because the receiver's actions are conditionally
independent, given the sender's announcements, any profitable deviation against $\tau'$ in the
non-blind game is also profitable in the blind game. ■

The inclusion is strict in general, as Example 4 below shows.

Example 4 There are two states S = {L,R}, and three actions for the receiver, B = 
{l,m,r}. The payoffs in the two states are given in Figure 7. 

        l       m       r                l       m       r
       2,2     0,0     0,0              0,0     2,2     0,3

            State L                          State R

Figure 7: The game in Example 4
We claim that $(2,2)$ is an equilibrium payoff when the sender observes the actions of
the receiver, but it is no longer an equilibrium payoff when the sender does not observe the 
receiver's actions. 

Note first that = |, and that E{M) ^ 0. By Theorem [H (2,2) G E{M), so that 
(2, 2) e liminf^^i SEs C liminf^.,! A^^;^. 

We now argue that (2, 2) is bounded away from the set NE^. Indeed, assume to the 
contrary that there is some equilibrium profile $(\sigma, \tau)$ of the blind game with a payoff close to
$(2, 2)$. In particular, with a probability close to one, there is a positive fraction of the stages
in which the current state is $R$ and the receiver plays $m$. Consider the strategy $\tau'$ which
plays as $\tau$, except that $\tau'$ plays $r$ whenever $\tau$ would play $m$. Because the sender does not
observe the receiver's actions, he cannot tell whether the receiver uses $\tau$ or $\tau'$, and therefore
$\tau'$ is a profitable deviation of the receiver: it yields the receiver payoff close to $\tfrac52$. ♦
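
The profitability of this deviation can be checked numerically. The sketch below reads the payoff table of Figure 7 with columns $l$, $m$, $r$ and assumes, since the example does not restate it, that the two states are equally likely. The names `receiver_payoff`, `matching` and `deviation` are illustrative.

```python
from fractions import Fraction

# payoffs[state][action] = (sender payoff, receiver payoff), read from Figure 7.
payoffs = {
    "L": {"l": (2, 2), "m": (0, 0), "r": (0, 0)},
    "R": {"l": (0, 0), "m": (2, 2), "r": (0, 3)},
}
# Assumption: both states have probability 1/2.
prior = {"L": Fraction(1, 2), "R": Fraction(1, 2)}


def receiver_payoff(plan):
    """Expected receiver payoff when action plan[state] is played in each state."""
    return sum(prior[s] * payoffs[s][plan[s]][1] for s in prior)


# Receiver's best payoff without information on the state.
uninformed_value = max(receiver_payoff({"L": b, "R": b}) for b in "lmr")

# Play that supports the payoff (2, 2): l in state L, m in state R.
matching = receiver_payoff({"L": "l", "R": "m"})

# Deviation discussed in the text: play r whenever m was prescribed.
deviation = receiver_payoff({"L": "l", "R": "r"})
```

The gain from deviating comes entirely from state R, where $r$ pays the receiver 3 instead of 2; in the blind game the sender cannot observe, and hence cannot punish, this switch.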

6.4 Relation to the one-shot game

The characterization implies that every equilibrium payoff of the one-shot game remains an 
equilibrium payoff in the dynamic game, provided players are patient enough. This property 
is not obvious a priori, since the game is not a repeated game. In particular, it would 
typically fail to hold if the state were constant throughout the play. 

Let $(\sigma, \tau)$ be an equilibrium of the one-shot game. Let $y : A \to \Delta(B)$ be the stationary
strategy defined as

\[ y(b \mid s) := \sum_{a \in A} \sigma(a \mid s)\,\tau(b \mid a). \]

Note that the expected payoff under $(\sigma, \tau)$ is $U(\mu_0,y)$. We claim that $U(\mu_0,y) \in E(\mathcal{M})$,
so that by Theorem 1 it is a sequential equilibrium payoff in the repeated game. Indeed,
because the receiver can guarantee $v^2$ in the one-shot game, condition C2 holds. Because
$\sigma$ is a best reply to $\tau$ in the one-shot game, the inequality in C1 holds for every $\mu$, and in
particular for every $\mu \in \mathcal{M}$.

This result has the implication that the lowest equilibrium payoff of the sender in the 
repeated game cannot be higher than his lowest equilibrium payoff in the one-shot game. As 
the example in Section [3] shows, it can in fact be strictly lower. 

On the other hand, the lowest equilibrium payoff of the receiver in both the one-shot 
game and the repeated game is equal to his babbling equilibrium payoff $v^2$. 







References 



[1] Athey S. and Bagwell K. (2008) Collusion with Persistent Cost Shocks. Econometrica, 
76, 493-540. 

[2] Aumann R.J. and Hart S. (2003) Long Cheap Talk. Econometrica, 71, 1619-1660. 

[3] Aumann R.J. and Maschler M.B. (1995) Repeated Games with Incomplete Information. 
The MIT Press. 

[4] Battaglini M. (2005) Long-term contracting with Markovian consumers. American 
Economic Review, 95, 637-658. 

[5] Bochnak J., Coste M. and Roy M.F. (1998) Real Algebraic Geometry. Springer. 

[6] Crawford V.P. and Sobel J. (1982) Strategic Information Transmission. Econometrica, 
50, 1431-1451. 

[7] Escobar J.F. and Toikka J. (2010) A Folk Theorem with Markovian Private 
Information, mimeo. 

[8] Farrell J. and Rabin M. (1996) Cheap talk. Journal of Economic Perspectives, 10, 103- 
118. 

[9] Forges F. and Koessler F. (2008) Long Persuasion Games. Journal of Economic Theory, 
143, 1-35. 

[10] Fudenberg D., Levine D.K. and Maskin E. (1994) The Folk Theorem with Imperfect Public 
Information. Econometrica, 62, 997-1040. 

[11] Golosov M., Skreta V., Tsyvinski A. and Wilson A. (2009) Dynamic Strategic 
Information Transmission. Preprint. 

[12] Green J.R. and Stokey N.L. (2007) A Two-Person Game of Information Transmission. 
Journal of Economic Theory, 135, 90-104. 

[13] Hörner J., Rosenberg D., Solan E. and Vieille N. (2010) On a Markov Game with 
One-Sided Incomplete Information. Operations Research, forthcoming. 

[14] Hörner J., Sugaya T., Takahashi S. and Vieille N. (2009) Recursive Methods in 
Discounted Stochastic Games: An Algorithm for $\delta \to 1$ and a Folk Theorem. Econometrica, 
forthcoming. 






[15] Jackson, M. O. and H.F. Sonnenschein (2007) Overcoming Incentive Constraints by 
Linking Decisions. Econometrica, 75, 241-258. 

[16] Krishna V. and Morgan J. (2001) A Model of Expertise. Quarterly Journal of 
Economics, 116, 747-775. 

[17] Krishna V. and Morgan J. (2008) Contracting for Information under Imperfect 
Commitment. RAND Journal of Economics, 39, 905-925. 

[18] Mailath G.J. and Samuelson L. (2006) Repeated Games and Reputations: Long-Run 
Relationships. Oxford University Press. 

[19] Phelan C. (2006) Public Trust and Government Betrayal. Journal of Economic Theory, 
130, 27-43. 

[20] Renault, J. (2006) The Value of Markov Chain Games with Lack of Information on One 
Side. Mathematics of Operations Research, 31, 490-512. 

[21] Sobel J. (2009) Signaling Games. Encyclopedia of Complexity and Systems Science, 
Springer, 19, 8125-8139. 

[22] Wiseman T. (2008) Reputation and Impermanent Types. Games and Economic 
Behavior, 62, 190-210. 

Appendix 

A Proof of Lemma [1] 

To prove Lemma [1] we need the following description of $\mathcal{M}$, which is of independent interest. 

A permutation matrix is a (square) matrix with entries in $\{0, 1\}$, such that each row 
and each column contains exactly one entry equal to 1. We denote by $\Phi$ the set of $S \times S$ 
permutation matrices, and by $I$ the matrix that corresponds to the identity permutation. 

Lemma 5 The set $\mathcal{M}(m)$ is equal to 

$$\mathcal{M}(m) = (\mu_0 - I + \mathrm{co}\,\Phi) \cap \mathbb{R}_+^{S \times S}.$$

Proof. The inclusion $\supseteq$ is clear. We prove the reverse inclusion. Take $\mu$ in $\mathcal{M}(m)$, and 
define the matrix $J := \mu + I - \mu_0$ in $\mathbb{R}^{S \times S}$. $J$ is a bistochastic matrix, hence it is a convex 
combination of permutation matrices. Since $\mu = J - I + \mu_0$, the result follows. ■ 

Proof of Lemma [1]. We only prove that C1 is equivalent to C'1. For every permutation $\phi$ 
over $S$ denote by $\mu^\phi \in \mathbb{R}^{S \times S}$ the matrix where the entry $(s, t)$ is equal to 1 if $t = \phi(s)$, and is 
0 otherwise. Note that $U^1(I, y) = \sum_{s \in S} u^1(s, y(\cdot \mid s))$, and $U^1(\mu^\phi, y) = \sum_{s \in S} u^1(s, y(\cdot \mid \phi(s)))$. 

Assume first that C'1 holds, and let $\mu \in \mathcal{M}(m)$. By Lemma [5], $\mu$ can be written 
$\mu = \mu_0 - I + \sum_\phi \alpha_\phi \mu^\phi$, where the $\alpha_\phi$ are non-negative real numbers that sum to one. 
Because $U^1$ is linear in $\mu$, 

$$U^1(\mu, y) = U^1(\mu_0, y) - U^1(I, y) + \sum_\phi \alpha_\phi U^1(\mu^\phi, y).$$

By C'1, $U^1(I, y) \geq U^1(\mu^\phi, y)$ for every permutation $\phi$, and therefore $U^1(I, y) \geq \sum_\phi \alpha_\phi U^1(\mu^\phi, y)$. 
It follows that $U^1(\mu_0, y) \geq U^1(\mu, y)$. Because this inequality holds for every $\mu \in \mathcal{M}(m)$, C1 
holds. 

Assume now that C1 holds. Fix a permutation $\phi$, and define $\mu_\varepsilon = \mu_0 - \varepsilon I + \varepsilon \mu^\phi$, where 
$\varepsilon > 0$. Because $m$ has full support, one has $\mu_\varepsilon \in \mathcal{M}(m)$ provided $\varepsilon$ is sufficiently small. 
Now, by C1, for each such $\varepsilon$, 

$$U^1(\mu_0, y) \geq U^1(\mu_\varepsilon, y) = U^1(\mu_0, y) - \varepsilon U^1(I, y) + \varepsilon U^1(\mu^\phi, y).$$

It follows that $U^1(\mu^\phi, y) \leq U^1(I, y)$. As this inequality holds for every permutation $\phi$, C'1 
holds. ■ 
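
Since C'1 only involves finitely many permutations, it can be tested by direct enumeration. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, `u1[s][b]` stands for the sender's payoff $u^1(s, b)$, and `y[a]` is the list of probabilities $y(\cdot \mid a)$ over actions.

```python
from itertools import permutations


def expected_u1(u1, y, s, a):
    """Sender's payoff in state s when the receiver plays the mixed action y[a]."""
    return sum(prob * u1[s][b] for b, prob in enumerate(y[a]))


def satisfies_c1_prime(u1, y, tol=1e-12):
    """Check sum_s u1(s, y(.|s)) >= sum_s u1(s, y(.|phi(s))) for every permutation phi."""
    states = range(len(u1))
    truthful = sum(expected_u1(u1, y, s, s) for s in states)
    return all(
        truthful + tol >= sum(expected_u1(u1, y, s, phi[s]) for s in states)
        for phi in permutations(states)
    )
```

For instance, with two states, two actions and a fully revealing `y`, the check passes when the sender wants the action to match the state and fails when he wants it mismatched. The enumeration is factorial in the number of states, so this is only meant for small examples.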



B Complements to the proof of Theorem [1] 

The proof of Theorem [1] given in the text is almost complete. For completeness, we provide 
below the proofs of Proposition [1] and of Lemma [3], which are missing. 

We start by addressing the issue of designing a system of beliefs for the receiver that 
is consistent with $\sigma^*$, and that satisfies an additional property. Since the game involves 
randomizing devices with uncountably many outcomes, the standard definition of consistency 
does not apply. We denote by $\lambda \in \Delta(S)$ a distribution with full support and, for $\eta > 0$, we 
denote by $\sigma^\eta$ the strategy that, following any history $h_n$, plays $\eta\lambda + (1 - \eta)\sigma^*(h_n)$. 

One can check that, for $\eta > 0$, the beliefs of the receiver are uniquely defined by Bayes' 
rule, and have a limit when $\eta \to 0$. Note that, following any history that is inconsistent 
with $\tau_0$, the belief of the receiver in stage $n$ is independent of the sender's announcements. 



Footnote: And the convergence is uniform w.r.t. the receiver's information set. 

Footnote: That is, should the sender fail to play the babbling announcement $a$, the receiver still 
interprets the sender's announcements as babbling. 

We denote by $\tau^*$ a strategy that coincides with $\tau_0$ as long as the sender does not deviate, 
and that plays in each later stage $n$ an action that (i) maximizes the current expected payoff 
of the receiver, given the belief held by the receiver in stage $n$, and (ii) does not depend on 
the announcements made by the sender since the deviation took place. 

By construction, the strategy $\sigma^*$ is sequentially rational at each information set of the 
sender, while the strategy $\tau^*$ is sequentially rational at each information set of the receiver 
that is inconsistent with $\tau_0$. 

B.1 Proof of Proposition [1] 

Assume w.l.o.g. that all payoffs belong to the interval $[0, 1]$. Denote by $\sigma_{truth}$ the strategy of 
the sender that announces truthfully the current state in each stage $n < N$, and by $\tau_{truth}$ 
the strategy of the receiver that plays $y(t_n)$ in each stage $n < N$. Thus, $\tau_{truth}$ coincides with 
$\tau_0$ until stage q. 

Let $\eta$ be given, and set $\xi = \frac{\eta}{|S| + 2}$. For every state $s \in S$ and every $n \in \mathbb{N}$, denote by 
$F_n(s)$ the empirical frequency of visits to $s$ up to (and including) stage $n$. Since the Markov 
chain is aperiodic, by the ergodic theorem there is $N_0$ such that with probability at least 
$1 - \xi$, $F_{(1-\xi)N}(s) < \frac{1}{1-\xi}\, m_N(s)$ for every state $s \in S$, as soon as $N \geq N_0$. It follows that, 
with probability at least $1 - \xi$, $\tau_0$ coincides with $\tau_{truth}$ in the first $(1 - \xi)N$ stages. 

This implies that 

$$U^1(\mu_{\sigma_{truth}, \tau_0}, y) \geq U^1(\mu_0, y) - (|S| + 1)\xi.$$

For fixed $N$, as $\delta$ converges to 1, the discounted payoff in each block converges to the average 
payoff in that block, and therefore for $\delta$ sufficiently large 

$$\gamma^1_\delta(\sigma_{truth}, \tau_0) \geq U^1(\mu_0, y) - (|S| + 2)\xi.$$

Because $\sigma_0$ is a best reply to $\tau_0$, we deduce that 

$$\gamma^1_\delta(\sigma_0, \tau_0) \geq \gamma^1_\delta(\sigma_{truth}, \tau_0) \geq U^1(\mu_0, y) - (|S| + 2)\xi = U^1(\mu_0, y) - \eta.$$

We again use the fact that, for fixed $N$, as $\delta$ goes to 1, the payoff $\gamma_\delta(\sigma_0, \tau_0)$ converges to 
$U(\mu_{\sigma_0, \tau_0}, y)$ to deduce that 

$$U^1(\mu_{\sigma_0, \tau_0}, y) \geq U^1(\mu_0, y) - \eta. \tag{9}$$

For fixed $N$, and for every $\delta$, the marginal distributions of $\mu_{\sigma_0, \tau_0} \in \Delta(S \times A)$ on $S$ and 
$A$ are respectively equal to $m$ and to $m_N$. 
Since the approximation $m_N$ converges to $m$ as $N \to +\infty$, the distribution $\mu_{\sigma_0, \tau_0}$ 
converges to the set $\mathcal{M}$ of copulas. Using Lemma [3], Proposition [1] therefore follows from (9). 

B.2 Proof of Lemma [3] 

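Let a copula $\mu \in \mathcal{M}$ be given. Present $\mu$ as a convex combination of the extreme points 
$(\mu_e)_e$ of $\mathcal{M}$: $\mu = \sum_e \alpha_e \mu_e$, with $\alpha_e \geq 0$ and $\sum_e \alpha_e = 1$. Recall that $\mu_0$ is one of the 
extreme points of $\mathcal{M}$. 

On the one hand, since $U^1$ is bi-linear, one has 

$$U^1(\mu_0, y) - U^1(\mu, y) = \varepsilon U^1(\mu_0, y_0) + (1 - \varepsilon) U^1(\mu_0, y_1) - \varepsilon U^1(\mu, y_0) - (1 - \varepsilon) U^1(\mu, y_1)$$
$$\geq \varepsilon \left( U^1(\mu_0, y_0) - U^1(\mu, y_0) \right) \tag{10}$$
$$= \varepsilon \Big( (1 - \alpha_0) U^1(\mu_0, y_0) - \sum_{e \neq 0} \alpha_e U^1(\mu_e, y_0) \Big) \tag{11}$$
$$\geq \varepsilon (1 - \alpha_0) c_1, \tag{12}$$

where the inequality (10) holds because $y_1 \in Y(\mathcal{M})$ and by C1. 
On the other hand, one has $\mu - \mu_0 = \sum_{e} \alpha_e (\mu_e - \mu_0)$, hence 

$$\| \mu - \mu_0 \|_1 \leq c_2 \sum_{e \neq 0} \alpha_e = c_2 (1 - \alpha_0). \tag{13}$$

The result follows from (12) and (13). 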



C Complements to the proof of Theorem [2] 

We here prove Lemma HI. For clarity, we introduce yet another copy $T$ of the set $S$. Intuitively, 
fictitious states are $T$-valued, while realized ones are $S$-valued. 

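Define $\mathcal{M}' \subseteq \Delta(S \times T)$ to be the set of distributions $\mu \in \mathcal{M}$ such that the following 
property P holds: 

Property P. For every $(s, t) \in S \times T$, one has 

$$\sum_{s' \in S} \mu(s' \mid t)\, p(s \mid s') = \sum_{t' \in T} \mu(s \mid t')\, p(t' \mid t). \tag{14}$$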

We will prove 

Lemma 6 Under Assumption A, the set $\mathcal{M}'$ coincides with the set $\mathcal{M}$. 

Lemma 7 Let $\mu \in \mathcal{M}'$ be given. There exists an $S$-valued process $(t_n)_n$, such that: 
P1 The law of the sequence $(t_n)_n$ is the same as the law of the sequence $(s_n)_n$. 
P2 The law of the pair $(s_n, t_n)$ is $\mu$, for each stage $n \in \mathbb{N}$. 
P3 The conditional law of $s_n$, given $t_1, \ldots, t_n$, is $\mu(\cdot \mid t_n)$. 
P4 Conditional on $s_n$, the vector $(t_1, \ldots, t_n)$ is independent of the future states $(s_{n+1}, s_{n+2}, \ldots)$. 

We emphasize that only Lemma [6] makes use of Assumption A. This has the following 
consequence. Given $\mu \in \mathcal{M}'$, using Lemma [7] and the construction of the paper, one has 
$U^1(\mu_0, y) \geq U^1(\mu, y)$. Thus, the conclusion $U^1(\mu_0, y) = \max_{\mu \in \mathcal{M}'} U^1(\mu, y)$ holds, irrespective 
of whether Assumption A is met or not. 

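C.1 Proof of Lemma [7] 

Let $\mu \in \mathcal{M}'$ be given, and define $\tilde\mu \in \Delta(T \times S \times T)$ by 

$$\tilde\mu(t', s, t) = \mu(s, t)\, p(t \mid t')\, \frac{m(t')}{m(t)}, \qquad (t', s, t) \in T \times S \times T. \tag{15}$$

For every two indices $i, j \in \{1, 2, 3\}$ with $i < j$, denote by $\tilde\mu_{i,j}$ the marginal of $\tilde\mu$ on the $i$-th 
and $j$-th coordinates. 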

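We will use the following properties of $\tilde\mu$. 

Lemma 8 One has 

1. $\tilde\mu_{2,3} = \mu$; 

2. $\tilde\mu_{1,3}(t', t) = m(t')\, p(t \mid t')$ for every $t, t' \in T$; 

3. $\tilde\mu_{1,2}(t', s') = \sum_{s \in S} \tilde\mu_{2,3}(s, t')\, p(s' \mid s)$, for each $t' \in T$, $s' \in S$; 

4. $\tilde\mu(s \mid t', t) = \mu(s \mid t)$ for each $(t', s, t) \in T \times S \times T$. 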

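Proof. We prove the four claims in turn. Let $(s, t) \in S \times T$ be given. One has 

$$\tilde\mu_{2,3}(s, t) = \sum_{t' \in T} \tilde\mu(t', s, t) = \sum_{t' \in T} \mu(s, t)\, p(t \mid t')\, \frac{m(t')}{m(t)} = \frac{\mu(s, t)}{m(t)} \sum_{t' \in T} p(t \mid t')\, m(t') = \mu(s, t),$$

which proves the first claim. 

Footnote: The process $(t_n)_n$ is possibly defined on a probability space which is an enlargement of the one 
on which $(s_n)_n$ is defined. 

To prove the second claim, let $t', t \in T$ be given. One has 

$$\tilde\mu_{1,3}(t', t) = \sum_{s \in S} \tilde\mu(t', s, t) = \sum_{s \in S} \mu(s, t)\, p(t \mid t')\, \frac{m(t')}{m(t)} = p(t \mid t')\, m(t'),$$

where the last equality holds since the marginal distribution of $\mu$ on $T$ is $m$. 

We turn to the third claim. Let $t' \in T$, $s' \in S$ be given. By the first claim, and since 
$\mu \in \mathcal{M}'$, one has 

$$\sum_{s \in S} \tilde\mu_{2,3}(s, t')\, p(s' \mid s) = \sum_{s \in S} \mu(s, t')\, p(s' \mid s) = m(t') \sum_{t \in T} \mu(s' \mid t)\, p(t \mid t'). \tag{16}$$

On the other hand, 

$$\tilde\mu_{1,2}(t', s') = \sum_{t \in T} \tilde\mu(t', s', t) = \sum_{t \in T} \mu(s', t)\, p(t \mid t')\, \frac{m(t')}{m(t)}. \tag{17}$$

The third claim follows from (16) and (17). 

Finally, let $(t', s, t) \in T \times S \times T$ be given. By the second claim, 

$$\tilde\mu(s \mid t', t) = \frac{\tilde\mu(t', s, t)}{\tilde\mu_{1,3}(t', t)} = \frac{\mu(s, t)\, p(t \mid t')}{p(t \mid t')\, m(t')} \cdot \frac{m(t')}{m(t)} = \frac{\mu(s, t)}{m(t)} = \mu(s \mid t),$$

and the fourth claim follows. ■ 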
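
The four claims of Lemma [8] can be verified in exact arithmetic on a small example. The sketch below is only an illustration under explicit assumptions: a three-state chain of the form used in Section C.2 (so that Property P holds for every copula), hypothetical weights `alpha`, and a hypothetical copula `mu` that mixes the diagonal copula with the independent one. The convention `p[s][s2]` for $p(s_2 \mid s)$ is also a choice of the sketch.

```python
from fractions import Fraction as F

alpha = [F(1, 6), F(1, 3), F(1, 4)]  # hypothetical weights for Assumption A
C = sum(alpha)
n = len(alpha)
m = [a / C for a in alpha]           # invariant measure m(s) = alpha_s / C
# p[s][s2] = p(s2 | s): move to s2 != s with probability alpha[s2], otherwise stay.
p = [[alpha[s2] if s2 != s else 1 - C + alpha[s] for s2 in range(n)] for s in range(n)]
# A copula with both marginals equal to m: half diagonal, half independent.
mu = [[(m[s] if s == t else F(0)) / 2 + m[s] * m[t] / 2 for t in range(n)] for s in range(n)]


def mu_tilde(t1, s, t):
    """Equation (15): weight of (previous report t1, current state s, current report t)."""
    return mu[s][t] * p[t1][t] * m[t1] / m[t]


R = range(n)
claim1 = all(sum(mu_tilde(t1, s, t) for t1 in R) == mu[s][t] for s in R for t in R)
claim2 = all(sum(mu_tilde(t1, s, t) for s in R) == m[t1] * p[t1][t] for t1 in R for t in R)
claim3 = all(
    sum(mu_tilde(t1, s2, t) for t in R) == sum(mu[s][t1] * p[s][s2] for s in R)
    for t1 in R for s2 in R
)
claim4 = all(
    mu_tilde(t1, s, t) / (m[t1] * p[t1][t]) == mu[s][t] / m[t]
    for t1 in R for s in R for t in R
)
```

Using `Fraction` avoids rounding, so the equalities are tested exactly rather than up to a tolerance.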

We construct the sequence $(t_n)_n$ as follows. The initial values $t_0$ and $t_1$ are drawn 
according to the conditional distribution $\tilde\mu(\cdot \mid s_1) \in \Delta(T \times T)$. For $n \geq 2$, $t_n$ is drawn 
according to the conditional distribution $\tilde\mu(\cdot \mid t_{n-1}, s_n)$. In this construction, $t_0$ is used to 
unify the treatment of $s_1$ with that of $(s_n)_{n \geq 2}$. Property P4 thus holds by construction. 
Properties P1 and P2 follow from the next lemma. 

Lemma 9 The law of $(t_{n-1}, s_n, t_n)$ is equal to $\tilde\mu$, for each stage $n \geq 1$. 

Proof. We argue by induction. Observe that the law of $s_1$ is equal to $m$. Therefore, 

$$P((t_0, s_1, t_1) = (t', s, t)) = m(s)\, \tilde\mu(t', t \mid s) = \tilde\mu(t', s, t).$$

Assume that the claim holds for some $n \in \mathbb{N}$. We will prove that the law of $(t_n, s_{n+1})$ is 
then equal to $\tilde\mu_{1,2}$. This follows from the following sequence of equalities, which holds for 
every $t' \in T$, $s \in S$: 

$$P((t_n, s_{n+1}) = (t', s)) = \sum_{s' \in S} P((s_n, t_n, s_{n+1}) = (s', t', s))$$
$$= \sum_{s' \in S} P((s_n, t_n) = (s', t')) \times P(s_{n+1} = s \mid (s_n, t_n) = (s', t'))$$
$$= \sum_{s' \in S} \tilde\mu_{2,3}(s', t')\, p(s \mid s') = \tilde\mu_{1,2}(t', s),$$

where the last equality follows from Lemma [8](3) and P4. Since the conditional law of 
$t_{n+1}$ given $(t_n, s_{n+1})$ is equal to $\tilde\mu(\cdot \mid t_n, s_{n+1})$, this yields the claim for $n + 1$. ■ 

Finally, property P3 follows from the second part of the next lemma. The first part of 
the lemma is needed for the proof of the second part. 

Lemma 10 (1) The conditional law of $t_n$ given $(t_0, \ldots, t_{n-1})$ coincides with the conditional 
law of $t_n$ given $t_{n-1}$. 

(2) The conditional law of $s_n$ given $(t_0, \ldots, t_{n-1}, t_n)$ coincides with the conditional law of 
$s_n$ given $t_n$. 

Proof. The proof is by induction. For $n = 1$, the first statement trivially holds, while the 
second statement holds by Lemma [8](1). Assume that the claim holds for some $n \in \mathbb{N}$. For 
brevity, we denote by $t_n, s_n, \ldots$ generic values of $\mathbf{t}_n, \mathbf{s}_n, \ldots$, and we write $P(t_n, s_n)$ instead of 
$P((\mathbf{t}_n, \mathbf{s}_n) = (t_n, s_n))$. 

Observe first that by the definition of $(t_n)$, 

$$P(t_{n+1} \mid t_0, \ldots, t_n) = \sum_{s_{n+1} \in S} P(s_{n+1} \mid t_0, \ldots, t_n)\, P(t_{n+1} \mid t_0, \ldots, t_n, s_{n+1})$$
$$= \sum_{s_{n+1} \in S} P(s_{n+1} \mid t_0, \ldots, t_n) \times \tilde\mu(t_{n+1} \mid t_n, s_{n+1}). \tag{18}$$

Moreover, 

$$P(s_{n+1} \mid t_0, \ldots, t_n) = \sum_{s_n \in S} P(s_n \mid t_0, \ldots, t_n)\, P(s_{n+1} \mid t_0, \ldots, t_n, s_n)$$
$$= \sum_{s_n \in S} P(s_n \mid t_0, \ldots, t_n) \times p(s_{n+1} \mid s_n)$$
$$= \sum_{s_n \in S} P(s_n \mid t_n) \times p(s_{n+1} \mid s_n), \tag{19}$$

where the last equality holds by the induction hypothesis. Note that the right-hand side of 
(19) is independent of $(t_0, t_1, \ldots, t_{n-1})$, and therefore 

$$P(s_{n+1} \mid t_0, \ldots, t_n) = P(s_{n+1} \mid t_n). \tag{20}$$

Plugging (19) in (18), one obtains 

$$P(t_{n+1} \mid t_0, \ldots, t_n) = \sum_{s_n, s_{n+1} \in S} P(s_n \mid t_n) \times p(s_{n+1} \mid s_n) \times \tilde\mu(t_{n+1} \mid t_n, s_{n+1}).$$

The right-hand side is independent of $(t_0, \ldots, t_{n-1})$, hence it is equal to $P(t_{n+1} \mid t_n)$, and the 
first part of the lemma follows. 

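We turn to the second statement. One has 

$$P(s_{n+1} \mid t_0, \ldots, t_{n+1}) = \frac{P(s_{n+1}, t_{n+1} \mid t_0, \ldots, t_n)}{P(t_{n+1} \mid t_0, \ldots, t_n)} = \frac{P(s_{n+1} \mid t_0, \ldots, t_n) \times P(t_{n+1} \mid t_0, \ldots, t_n, s_{n+1})}{P(t_{n+1} \mid t_0, \ldots, t_n)}$$
$$= \frac{P(s_{n+1} \mid t_n)\, \tilde\mu(t_{n+1} \mid t_n, s_{n+1})}{P(t_{n+1} \mid t_n)}$$
$$= \frac{P(s_{n+1} \mid t_n)}{P(t_{n+1} \mid t_n)} \cdot \frac{\mu(s_{n+1}, t_{n+1})\, p(t_{n+1} \mid t_n)}{\tilde\mu_{1,2}(t_n, s_{n+1})} \cdot \frac{m(t_n)}{m(t_{n+1})}$$
$$= \frac{\mu(s_{n+1}, t_{n+1})}{m(t_{n+1})} = \mu(s_{n+1} \mid t_{n+1}),$$

where the third equality holds by (20), the construction of $(t_n)_n$ and the first claim, and the 
fourth equality holds by (15). The fifth equality uses $P(s_{n+1} \mid t_n) = \tilde\mu_{1,2}(t_n, s_{n+1}) / m(t_n)$ and 
$P(t_{n+1} \mid t_n) = p(t_{n+1} \mid t_n)$, which follow from Lemma [9] and Lemma [8](2). This concludes the 
proof of the induction step. ■ 
The proof of Lemma [7] is now completed. 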

C.2 Proof of Lemma [6] 

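We here verify that if Assumption A holds then $\mathcal{M} = \mathcal{M}'$. Let $p$ be a transition function 
such that $p(s' \mid s) = \alpha_{s'}$ for every two states $s \neq s'$, and $p(s \mid s) = 1 - \sum_{s' \neq s} \alpha_{s'}$. Set $C = \sum_{s \in S} \alpha_s$. 

One can verify that the invariant measure of $p$ is given by $m(s) = \frac{\alpha_s}{C}$ for each $s \in S$. 
Let $\mu \in \mathcal{M}$. We will prove that for every $(t, s') \in T \times S$, the equality 

$$\sum_{s \in S} \mu(s \mid t)\, p(s' \mid s) = \sum_{t' \in T} \mu(s' \mid t')\, p(t' \mid t) \tag{21}$$

holds. Fix $t \in T$ and $s' \in S$. Observe that 

$$\sum_{s \in S} \mu(s \mid t)\, p(s' \mid s) = \mu(s' \mid t) \Big( 1 - \sum_{s \neq s'} \alpha_s \Big) + \sum_{s \neq s'} \alpha_{s'}\, \mu(s \mid t)$$
$$= \mu(s' \mid t) \Big( 1 - \sum_{s \neq s'} \alpha_s \Big) + \alpha_{s'} \left( 1 - \mu(s' \mid t) \right)$$
$$= \alpha_{s'} + \mu(s' \mid t)(1 - C). \tag{22}$$

On the other hand, one has 

$$\sum_{t' \in T} \mu(s' \mid t')\, p(t' \mid t) = \mu(s' \mid t) \Big( 1 - \sum_{t' \neq t} \alpha_{t'} \Big) + \sum_{t' \neq t} \mu(s' \mid t')\, \alpha_{t'} \tag{23}$$
$$= \mu(s' \mid t)(1 - C + \alpha_t) + \sum_{t' \neq t} \mu(s' \mid t')\, \alpha_{t'}. \tag{24}$$

When subtracting (23) from (22) one obtains 

$$\sum_{s \in S} \mu(s \mid t)\, p(s' \mid s) - \sum_{t' \in T} \mu(s' \mid t')\, p(t' \mid t) = \alpha_{s'} - \mu(s' \mid t)\, \alpha_t - \sum_{t' \neq t} \mu(s' \mid t')\, \alpha_{t'}$$
$$= \alpha_{s'} - C \sum_{t' \in T} \mu(s' \mid t')\, m(t') \tag{25}$$
$$= \alpha_{s'} - C m(s') = 0,$$

where (25) and the last equality hold since $\alpha_s = C m(s)$ for every $s \in S$. This proves (21), as desired. 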
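
Both the formula for the invariant measure and the equality (21) lend themselves to a direct numerical check. The sketch below is a hedged illustration with hypothetical helper names; `p[s][s2]` stands for $p(s_2 \mid s)$ and `mu[s][t]` for $\mu(s, t)$, and exact fractions are used so that equalities can be tested literally.

```python
from fractions import Fraction as F


def assumption_a_chain(alpha):
    """Transition matrix with p(s2 | s) = alpha[s2] for s2 != s (rows indexed by s)."""
    c = sum(alpha)
    n = len(alpha)
    return [[alpha[s2] if s2 != s else 1 - c + alpha[s] for s2 in range(n)] for s in range(n)]


def is_invariant(m, p):
    """Check that m is invariant for p, i.e. sum_s m(s) p(s2 | s) == m(s2)."""
    n = len(m)
    return all(sum(m[s] * p[s][s2] for s in range(n)) == m[s2] for s2 in range(n))


def satisfies_property_p(mu, m, p):
    """Check equation (21): sum_s mu(s|t) p(s2|s) == sum_t2 mu(s2|t2) p(t2|t)."""
    n = len(m)
    cond = [[mu[s][t] / m[t] for t in range(n)] for s in range(n)]  # cond[s][t] = mu(s | t)
    return all(
        sum(cond[s][t] * p[s][s2] for s in range(n))
        == sum(cond[s2][t2] * p[t][t2] for t2 in range(n))
        for t in range(n) for s2 in range(n)
    )
```

As a contrast, for the deterministic three-state cycle (which is not of the form required by Assumption A) the copula that swaps two states fails the check, so the assumption does matter for (21).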



D Proof of Theorem [3] 



The proof of Theorem [3] consists of two independent parts. We first prove that, if condition 
B does not hold for some game G, then it does not hold throughout some neighborhood of 
G. 

Proposition 3 Let G be a game that does not satisfy condition B. Then there is a 
neighborhood $\mathcal{N}$ of G such that no game in $\mathcal{N}$ satisfies condition B. 

Proof. The proof relies on the theory of semi-algebraic sets. We refer to Bochnak, Coste 
and Roy (1998) for the results used below. Recall that the set of extreme points of the 
polytope $\mathcal{M}$ is denoted by $\mathcal{M}_e$. 

We will use the following two properties, that hold for constant functions $y : S \to \Delta(B)$. 

R1. If $y : S \to \Delta(B)$ is constant, then $U^1(\mu_0, y) = U^1(\mu, y)$ for every $\mu \in \mathcal{M}$. 
R2. If $y : S \to \Delta(B)$ is constant, then $U^2(\mu_0, y) \leq v^2$. 

Property R1 holds because when $y$ is constant, the payoff is independent of the sender's 
announcements. Property R2 holds because $v^2$ is the maximum of $U^2(\mu_0, y)$ over all constant 
functions $y$. 
Given a payoff function $u : S \times B \to \mathbb{R}^2$, we denote by $S(u)$ the system of inequalities 

$$U^2(\mu_0, y) \geq v^2_u \quad \text{and} \quad U^1(\mu_0, y) \geq U^1(\mu, y) \text{ for all } \mu \in \mathcal{M}_e,$$

with unknowns $y : S \to \Delta(B)$, where $v^2_u = \max_{b \in B} U^2(\mu_0, b)$ is the min-max value of the 
receiver in the game with payoffs $u$. 

We say that a vector $y \in \mathbb{R}^{S \times B}$ is constant if $y(s, b)$ only depends on $b$. 

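Let $u$ denote the payoff function of G. By assumption, any solution $y$ to $S(u)$ is 
constant. We will show that this implies that all solutions to $S(\tilde u)$ are constant, for all $\tilde u$ in a 
neighborhood of $u$. 

Assume to the contrary that for every $\varepsilon > 0$ there is a payoff function $u_\varepsilon : S \times B \to \mathbb{R}^2$ such 
that (i) $\| u - u_\varepsilon \| < \varepsilon$, and (ii) the system $S(u_\varepsilon)$ has a non-constant solution $y_\varepsilon \in \mathbb{R}^{S \times B}$. 

This implies that there is a semi-algebraic map $\varepsilon \in (0, 1) \mapsto (u_\varepsilon, y_\varepsilon)$ such that (i) 
$\lim_{\varepsilon \to 0} u_\varepsilon = u$, and (ii) $y_\varepsilon$ is a non-constant solution to $S(u_\varepsilon)$ for every $\varepsilon > 0$ small enough. 

In particular, the map $\varepsilon \mapsto y_\varepsilon$ has an expansion to a Puiseux series in a neighborhood of 
zero: there exist $\varepsilon_0 > 0$, a natural number $r$ and vectors $y_k \in \mathbb{R}^{S \times B}$ for $k \geq 0$ such that 

$$y_\varepsilon = \sum_{k=0}^{\infty} \varepsilon^{k/r} y_k$$

for every $\varepsilon \in (0, \varepsilon_0)$, and a similar expansion exists for the map $\varepsilon \mapsto u_\varepsilon$. 

Note that $y_0 = \lim_{\varepsilon \to 0} y_\varepsilon$. This implies in particular that $y_0(\cdot, s) \in \Delta(B)$ for every $s \in S$, 
and that $y_0$ is a solution to $S(u)$. In particular, $y_0$ is constant. 

Because $y_\varepsilon(\cdot, s) \in \Delta(B)$, it follows that $\sum_{b \in B} y_\varepsilon(b \mid s) = 1$ for every $\varepsilon > 0$ and every 
$s \in S$, so that $\sum_{b \in B} y_k(b \mid s) = 0$ for every $k \geq 1$ and every $s \in S$. 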

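Let $l \geq 0$ be the maximal integer such that $y_0, y_1, \ldots, y_l$ are constant functions. Because 
$y_\varepsilon$ is non-constant for every $\varepsilon > 0$, we have $l < \infty$. Define a vector $d \in \mathbb{R}^B$ by 

$$d(b) = \min_{s \in S} y_{l+1}(b, s), \qquad \forall b \in B.$$

Note that 

$$y_\varepsilon = \sum_{k=0}^{\infty} \varepsilon^{k/r} y_k \tag{26}$$
$$= \Big( \sum_{k=0}^{l} \varepsilon^{k/r} y_k + \varepsilon^{(l+1)/r} d \Big) + \varepsilon^{(l+1)/r} (y_{l+1} - d) + \sum_{k=l+2}^{\infty} \varepsilon^{k/r} y_k. \tag{27}$$

The first term $\big( \sum_{k=0}^{l} \varepsilon^{k/r} y_k + \varepsilon^{(l+1)/r} d \big)$ is independent of $s$, and all its coordinates are 
non-negative because $y_\varepsilon$ is non-negative for every $\varepsilon > 0$. Set 

$$z_\varepsilon = \frac{\sum_{k=0}^{l} \varepsilon^{k/r} y_k + \varepsilon^{(l+1)/r} d}{1 + \varepsilon^{(l+1)/r} \sum_{b \in B} d(b)}.$$

Then $z_\varepsilon(\cdot, s) \in \Delta(B)$ for every $s \in S$, and $z_\varepsilon$ is independent of $s$. Set 

$$w = \frac{y_{l+1} - d}{-\sum_{b \in B} d(b)}.$$

Then $w(s) \in \Delta(B)$ and $w$ is non-constant. We will show that $w$ solves $S(u)$, contradicting 
the assumption that all solutions of $S(u)$ are constant. 

By R2, for every $\varepsilon > 0$ we have $U^2(\mu_0, z_\varepsilon) \leq v^2_{u_\varepsilon}$. But $U^2(\mu_0, y_\varepsilon) \geq v^2_{u_\varepsilon}$, and $y_\varepsilon$ is a 
convex combination of $z_\varepsilon$, $w$, and a "tail" which is of a lower order of $\varepsilon$; by taking the limit 
$\varepsilon \to 0$ and using $v^2_{u_\varepsilon} \to v^2$ we obtain $U^2(\mu_0, w) \geq v^2$. 

Fix $\mu \in \mathcal{M}_e$. By R1 it follows that $U^1(\mu_0, z_\varepsilon) = U^1(\mu, z_\varepsilon)$. Because $U^1(\mu_0, y_\varepsilon) \geq 
U^1(\mu, y_\varepsilon)$, it follows by the same reasoning as above that $U^1(\mu_0, w) \geq U^1(\mu, w)$. ■ 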

We turn to the second part of the proof. 

Proposition 4 Let G be a game such that condition B holds. Then any neighborhood of 
the game G contains a game G' such that $E_{G'}(\mathcal{M}) \neq \emptyset$. 

Proof. The proof combines three independent lemmas. We first show that there are 
perturbations of $u^2$ such that the inequality in (i) holds strictly for the perturbed game. 
Next, we show that the map $y$ may be assumed to be one-to-one. Finally, we construct 
perturbations of $u^1$ such that the inequalities in (ii) will be strict. 

Lemma 11 Let G be a game with payoff function $u$, and let $y : S \to \Delta(B)$ be a non-constant 
function such that $U^2(\mu_0, y) \geq v^2$. Then, any neighborhood of $u^2$ contains payoff functions 
$\tilde u^2$ such that $\tilde U^2(\mu_0, y) > \tilde v^2$. 

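Proof. Define $P \in \Delta(S \times B)$ by $P(s, b) := m(s)\, y(b \mid s)$, for $s \in S$, $b \in B$, and let $\varepsilon > 0$ 
be given. We abuse notations and still denote by $P$ the two marginals of $P$ over $S$ and $B$. 
Note that $P(s) = m(s) > 0$ for each $s \in S$. Define $\tilde u^2 : S \times B \to \mathbb{R}$ by $\tilde u^2(s, b) = u^2(s, b)$ if 
$P(b) = 0$, and 

$$\tilde u^2(s, b) = u^2(s, b) + \varepsilon\, \frac{P(b \mid s)}{P(b)} \quad \text{if } P(b) > 0.$$

We claim that $\tilde U^2(\mu_0, y) > \tilde v^2$. Since $\varepsilon$ is arbitrary, the result will follow. Note first that, 
for $b \in B$ such that $P(b) > 0$, one has 

$$\sum_{s \in S} m(s)\, \tilde u^2(s, b) = \sum_{s \in S} m(s)\, u^2(s, b) + \varepsilon \sum_{s \in S} \frac{m(s)\, P(b \mid s)}{P(b)} = \sum_{s \in S} m(s)\, u^2(s, b) + \varepsilon.$$

Hence, $\tilde v^2 = v^2 + \varepsilon$ (see Eq. (1)). On the other hand, since $y(b \mid s) = P(b \mid s)$, 

$$\tilde U^2(\mu_0, y) = \sum_{s \in S, b \in B} m(s)\, y(b \mid s)\, \tilde u^2(s, b) = U^2(\mu_0, y) + \varepsilon \sum_{s \in S} m(s) \sum_{b \in B} \frac{(P(b \mid s))^2}{P(b)}.$$

Viewed as a function of the probability distribution $q \in \Delta(B)$, the expression 

$$\sum_{b \in B} \frac{(q(b))^2}{P(b)}$$

is strictly convex, and admits a unique minimum equal to 1, when $q = P$. Thus, for fixed 
state $s \in S$, one has $\sum_{b \in B} \frac{(P(b \mid s))^2}{P(b)} \geq 1$, with a strict inequality whenever the conditional 
distribution $P(\cdot \mid s)$ differs from $P$. Since $y$ is non-constant, there exists one state $s$ such that 
$P(\cdot \mid s) \neq P$. Therefore, 

$$\tilde U^2(\mu_0, y) > U^2(\mu_0, y) + \varepsilon \geq v^2 + \varepsilon = \tilde v^2,$$

as desired. ■ 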
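
The convexity bound used in this proof, namely that $\sum_b (P(b \mid s))^2 / P(b)$ averages to at least 1 and to strictly more than 1 when $y$ is non-constant, can be illustrated numerically. The sketch below uses hypothetical function names and assumes that `m[s]` is the invariant probability of state `s` and that `y[s]` lists the probabilities $y(\cdot \mid s)$.

```python
from fractions import Fraction as F


def chi_square_sum(q, P):
    """Return sum_b q(b)^2 / P(b) over actions b with P(b) > 0."""
    return sum(qb * qb / Pb for qb, Pb in zip(q, P) if Pb > 0)


def perturbation_gain(m, y):
    """Return sum_s m(s) sum_b P(b|s)^2 / P(b) for P(s, b) = m(s) y(b|s)."""
    n_actions = len(y[0])
    marginal = [sum(m[s] * y[s][b] for s in range(len(m))) for b in range(n_actions)]
    return sum(m[s] * chi_square_sum(y[s], marginal) for s in range(len(m)))
```

The factor returned by `perturbation_gain` multiplies $\varepsilon$ in the perturbed payoff of the receiver, while the babbling value increases by $\varepsilon$ only; a value above 1 is therefore what makes the inequality strict.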



Lemma 12 Let G be a game with payoff function $u$, and let $y : S \to \Delta(B)$ be such that 
$U^1(\mu_0, y) \geq U^1(\mu, y)$ for each $\mu \in \mathcal{M}$. Then, any neighborhood of $y$ in $\mathbb{R}^{S \times B}$ contains a 
one-to-one function $\tilde y : S \to \Delta(B)$ such that $U^1(\mu_0, \tilde y) \geq U^1(\mu, \tilde y)$ for each $\mu \in \mathcal{M}$. 

Proof. It suffices to show the existence of a one-to-one map $z : S \to \Delta(B)$ such that 
$U^1(\mu_0, z) \geq U^1(\mu, z)$ for each $\mu \in \mathcal{M}$. Indeed, the conclusion of the lemma then follows by 
setting $\tilde y = (1 - \varepsilon) y + \varepsilon z$, for $\varepsilon > 0$ small enough. 

Let $(\bar z_s)_{s \in S}$ be arbitrary distinct elements of $\Delta(B)$. Let $\psi^*$ be a permutation over $S$ 
that maximizes the sum $\sum_{s \in S} u^1(s, \bar z_{\psi(s)})$ over all permutations $\psi$, and set $z_s = \bar z_{\psi^*(s)}$. 
By construction, one has 

$$\sum_{s \in S} u^1(s, z_s) \geq \sum_{s \in S} u^1(s, z_{\phi(s)}),$$

for every permutation $\phi$ over $S$. By Lemma [1] this implies $U^1(\mu_0, z) \geq U^1(\mu, z)$ for every 
$\mu \in \mathcal{M}$, as desired. ■ 

Lemma 13 Let G be a game with payoff function $u$, and let $y : S \to \Delta(B)$ be a one-to-one 
map such that $U^1(\mu_0, y) \geq U^1(\mu, y)$ for each $\mu \in \mathcal{M}$. Then, any neighborhood of $u^1$ contains 
payoff functions $\tilde u^1$ such that $\tilde U^1(\mu_0, y) > \tilde U^1(\mu, y)$ for each $\mu \in \mathcal{M}$, $\mu \neq \mu_0$. 

Note that the existence of a stationary strategy $y$ that satisfies the requirements follows from 
Lemma [12]. 

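Proof. Let G, $u$ and $y$ be as stated. Given $\varepsilon > 0$, we define $\tilde u^1 : S \times B \to \mathbb{R}$ by 

$$\tilde u^1(s, b) = u^1(s, b) + \varepsilon\, y(b \mid s).$$

We will prove that for every $\varepsilon > 0$, one has $\tilde U^1(\mu_0, y) > \tilde U^1(\mu, y)$ for each $\mu \in \mathcal{M} \setminus \{\mu_0\}$. 

Given a permutation $\phi$ over $S$, we denote by $Y_\phi \in \mathbb{R}^{S \times B}$ the vector whose $(s, b)$-
component is equal to $y(b \mid \phi(s))$. Then, 

$$\sum_{s \in S} \tilde u^1(s, y(\cdot \mid \phi(s))) = \sum_{s \in S, b \in B} y(b \mid \phi(s))\, \tilde u^1(s, b) \tag{28}$$
$$= \sum_{s \in S, b \in B} y(b \mid \phi(s))\, u^1(s, b) + \varepsilon \sum_{s \in S, b \in B} y(b \mid \phi(s))\, y(b \mid s) \tag{29}$$
$$= \sum_{s \in S} u^1(s, y(\cdot \mid \phi(s))) + \varepsilon \langle Y_\phi, Y_{Id} \rangle, \tag{30}$$

where $\langle Y_\phi, Y_{Id} \rangle = \sum_{s \in S, b \in B} y(b \mid \phi(s))\, y(b \mid s)$ is the standard scalar product in $\mathbb{R}^{S \times B}$. 

Since $y$ is one-to-one, the vectors $Y_\phi$ and $Y_{Id}$ are not co-linear as soon as $\phi \neq Id$. By 
the Cauchy-Schwarz inequality, it follows that 

$$\langle Y_{Id}, Y_\phi \rangle < \| Y_{Id} \|_2 \, \| Y_\phi \|_2 = \| Y_{Id} \|_2^2 = \langle Y_{Id}, Y_{Id} \rangle, \tag{31}$$

where the first equality holds since the components of $Y_\phi$ are obtained by permuting the 
components of $Y_{Id}$. 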
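
The strict inequality (31) is easy to confirm on examples by enumerating permutations. The sketch below is an illustration only, with hypothetical function names; `y[s]` lists the probabilities $y(\cdot \mid s)$ and a permutation is represented as a tuple `phi` with `phi[s]` the image of `s`.

```python
from itertools import permutations


def inner(y, phi):
    """Scalar product <Y_phi, Y_Id> = sum_{s,b} y(b | phi(s)) * y(b | s)."""
    return sum(
        y[phi[s]][b] * y[s][b] for s in range(len(y)) for b in range(len(y[s]))
    )


def strict_gap_for_all_permutations(y):
    """True if <Y_phi, Y_Id> < <Y_Id, Y_Id> for every permutation phi other than Id."""
    identity = tuple(range(len(y)))
    norm_sq = inner(y, identity)
    return all(
        inner(y, phi) < norm_sq for phi in permutations(identity) if phi != identity
    )
```

When `y` is one-to-one the function returns `True`, in line with (31); when two states share the same mixed action, the permutation swapping them gives equality and the function returns `False`.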

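On the other hand, observe that by Lemma [1], one has 

$$\sum_{s \in S} u^1(s, y(\cdot \mid \phi(s))) \leq \sum_{s \in S} u^1(s, y(\cdot \mid s)). \tag{32}$$

Plugging (32) into (28), and using (31), one obtains 

$$\sum_{s \in S} \tilde u^1(s, y(\cdot \mid \phi(s))) < \sum_{s \in S} u^1(s, y(\cdot \mid s)) + \varepsilon \langle Y_{Id}, Y_{Id} \rangle = \sum_{s \in S} \tilde u^1(s, y(\cdot \mid s)).$$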

By Lemma [1] this yields $\tilde U^1(\mu_0, y) > \tilde U^1(\mu, y)$, for every $\mu \neq \mu_0$ in $\mathcal{M}$, as desired. ■ 
The proof of Proposition [4] follows from Lemmas [11], [12] and [13]. ■ 